Chapter 1


Introduction 


1.1  Background  and  Motivation 


Current approaches to robust control take for granted the availability of uncertainty descriptions,
e.g., parameters lying in fixed intervals (e.g., Barmish[5], Biernacki et al.[9]) or frequency domain
($\mathcal{H}_\infty$) bounds (e.g., Safonov et al.[60], Doyle et al.[16], Francis and Zames[22]). However, the
question remains as to how these descriptions might be obtained in practice. On the other hand,
the identification community has emphasized estimation of nominal models without developing an
associated estimate of model quality. When model error evaluation has been carried out, it has usually
accounted only for random effects due to exogenous inputs rather than errors due to the inherent model
limitations which necessitate a robust control design, e.g., Jenkins and Watts[31], Ljung[49] and
the references therein. There is now a greater recognition by both communities of the requirements
of the other. This recognition is evidenced by the strong interest shown by researchers from both
the identification and robust control communities, e.g., the recent Special Issue [35], and the many
sessions on this topic at recent conferences and workshops.

Despite  this  research  activity,  this  subject  is  still  in  its  infancy  and  many  developments  are  likely 
to  arise  from  intensive  research  efforts  devoted  to  the  interaction  between  the  previously  separate 
fields  of  identification  and  robust  control.  To  fill  the  needs  of  robust  control  design  will  require  a 
new  approach  to  system  identification  which  provides  both  a  nominal  model  and  a  measure  of  its 
uncertainty.  We  refer  to  this  approach  as  “set-membership  identification”  or  “set  estimation.” 

The  long-range  goal  of  this  research  is  to  form  a  new  system  identification  paradigm  that 
fulfills  all  the  requirements  of  robust  control  design.  This  will  have  a  significant  impact  in  the 
engineering  community  where  such  an  “engineering  theory”  is  badly  needed.  Moreover,  with  the 
wide availability and use of CACSD packages, such as MATRIXx, research results will be rapidly
spread.  Since  system  identification  and  robust  control  design  are  ubiquitous  engineering  activities, 
the  benefits  of  this  research  will  be  widely  utilized,  particularly  among  control  engineers  involved 
with  aircraft,  spacecraft,  robotics,  and  industrial  automation. 

This  report  documents  our  research  efforts  which  concentrated  almost  exclusively  on  set-estimation. 
Some  effort  was  spent  on  the  important  next  step  of  robust  controller  design  using  the  estimated 
model  accuracy. 

In  the  remainder  of  this  chapter  we  provide  an  overview  of  the  issues  and  a  brief  summary  of 
our  results. 
1.2  Model  Accuracy  Estimation


As expounded by Ljung[49], identification consists of three essential ingredients, namely, (i) measured
data, (ii) a candidate model set, and (iii) a criterion for selecting a candidate model using the
data. Moreover, all three should be selected based on the intended use of the identified model. The
problem is the model set, which traditionally consists of a single parametric model. There is no
associated parametrization in the model set of a measure of uncertainty. Thus, the designer must
guess or have faith in the identified model when used for controller design. But this opposes all the
standing assumptions made in current robust control design methods. These methods require a set
of models, not a single model. For example, a model set can consist of a transfer function which
depends in a known way on uncertain parameters, or the set may be described as a nominal model
together with a frequency dependent “ball of uncertainty”.

The  integration  of  control  design  and  identification  is  not  altogether  a  new  issue.  The  most 
familiar  and  appealing  application  is  adaptive  control  where,  as  shown  in  figure  1.1,  a  model  is 
identified  concurrently  with  the  on-line  optimization  of  the  control  law  based  on  the  model.  This 
leads  to  intricate  nonlinear  recursions  which  have  not  been  fully  understood  to  date.  There  are 
global  stabilization  schemes  which  are  not  robust;  there  are  local  stability  results  applicable  to  the 
steady-state,  and  hardly  anything  is  known  about  the  transient  behavior  of  adaptive  systems,  e.g., 
Åström and Wittenmark[4], Anderson et al.[2].

A formulation where explicit control action is anticipated for the purpose of identification is
the so-called “dual control” design, e.g., Feldbaum[19], Bar-Shalom and Tse[6]. Due to the high
computational requirements associated with this method, implementation is only possible with
crude approximations which lead to problems similar to those of the adaptive case.

The approach we have been pursuing, illustrated in figure 1.2, is a two-step procedure, where
identification produces a nominal model along with an uncertainty profile. The control is then
designed to be robust with respect to the estimated model set. This results in an iterative solution
where models and control are adapted to the changing experimental conditions. This differs
considerably from the classical adaptive control scheme (figure 1.1) where the estimator produces
a single model with no information about model accuracy. In the robust control design procedure
of the new approach (figure 1.2), the plant model is replaced by a model set which reflects the
accuracy with which the model has been estimated.

In the work described here, we formulate a model set and an identification criterion from which
set-membership identification, using time-domain data and meeting the requirements of robust
control design, follows naturally. Specifically, we have investigated the following topics:

1.  high  order  least-squares  set-estimation  with  ARX  model  sets. 

2.  robust  control  with  uncertain  ARX  model  sets. 

3.  ellipsoid  sets  with  known  nonparametric  uncertainty.

4.  robust  control  of  ellipsoid  sets. 

5.  $\mathcal{L}_\infty$  identification.

Before we discuss the results of our efforts, there are some other relevant issues to clarify,
specifically, the character of uncertainty, computation, and MIMO systems.
Figure 1.1: Traditional adaptive control system with parameter estimator. (Signal label in the
diagram: Estimated Parameters.)


1.3  On  the  Character  of  Uncertainty 


The current debate amongst researchers involved with set-membership identification centers on the
nature of the set itself: is it probabilistic or deterministic/worst-case? Clearly both can be used to
quantify uncertainty in either disturbances or transfer functions. A probabilistic, or stochastic,
description of a disturbance is common practice and forms the basis for $\mathcal{H}_2$ filtering and control
design, i.e., optimal filtering and LQG control design. A power bounded set of disturbances and/or
a deterministic/worst-case description of transfer function uncertainty leads to $\mathcal{H}_\infty$ methods of
control design, e.g., Doyle et al.[15]. These sets can be combined leading to mixed $\mathcal{H}_2/\mathcal{H}_\infty$ control
design, e.g., Khargonekar and Rotea[33].

If we begin with a stochastic description of the exogenous inputs to a system, then the high-order
least-squares based identification methods described in section 2.2.2 lead naturally to the use
of a probabilistic set to describe the dynamic uncertainty, which is purely parametric. This result
immediately raises the question: what does a robust control mean in the context of probabilities?
We tend to think of a robust controller as providing an absolute guarantee against instability and/or
certain levels of performance degradation given a deterministic, or “hard,” bound on plant uncertainty.
With a probabilistic description, or “soft” bound, we must decide if 99.99% is safe enough.
To turn the question the other way, deterministic bounds necessitate guarding against the
worst-case. But conditions for the worst-case may be extreme, thereby leading to an overly conservative
controller. This brings us back to exactly the question of probabilities and outcomes,
and finally to a more fundamental question: is Nature neutral or conspiratorial?

Attempting  an  answer  at  this  time  may  not  be  necessary,  nor  very  fruitful.  Our  philosophy  has 
been  more  pragmatic.  We  will  leave  it  be,  and  follow  the  consequences  of  different  assumptions 
by  developing  a  theory  of  set-membership  identification  and  corresponding  (as  necessary)  “robust” 
control  design  methods  compatible  with  both  probabilistic  and  deterministic  plant  sets.  In  this 
way  we  can  explore  without  prejudice. 


Figure 1.2: Adaptive control with set estimator. (Signal label in the diagram: Model Set.)


1.4  Computing  the  Estimate


The computational issue is very relevant to system identification. The great appeal of
“least-squares,” and the principal reason for its ubiquity, is that a unique minimum is always
obtained, and there are very efficient and reliable methods for computing the solution. The
computational methods typically involve square-root algorithms such as the QR transformation, SVD
algorithms, as well as lattice forms for very high model orders. It is imperative that the calculations
are done in this manner, for otherwise significant numerical errors will accrue, even for a small
number of parameters. There are other reasons as well for using a QR method, e.g., (1) high model
orders and large amounts of data are easily handled, (2) data from different experiments are readily
combined without re-doing the entire estimation, and (3) prediction errors can be computed for
varying model orders directly from the QR transformation. These factors make it possible to easily
and rapidly generate extremely high order models from large amounts of data. A least squares
approach to set estimation will naturally benefit from all the existing computational theory and
software.
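As an illustration of the square-root approach, the following is a minimal sketch of a least-squares solve through a QR factorization, written in plain Python. It uses modified Gram-Schmidt on the augmented matrix [Phi | y], which is an assumption made here for brevity; production codes would normally use Householder-based library routines. The function name `qr_least_squares` is hypothetical and not part of the report.

```python
import math

def qr_least_squares(Phi, y):
    """Solve min ||y - Phi*theta||_2 by modified Gram-Schmidt QR.

    Phi is a list of N rows with p entries each (assumed full column rank).
    Returns (theta, rss), where rss is the residual sum of squares.
    """
    N, p = len(Phi), len(Phi[0])
    # Columns of the augmented matrix [Phi | y].
    cols = [[Phi[t][j] for t in range(N)] for j in range(p)] + [list(y)]
    R = [[0.0] * (p + 1) for _ in range(p)]
    for j in range(p):
        norm = math.sqrt(sum(v * v for v in cols[j]))
        R[j][j] = norm
        q = [v / norm for v in cols[j]]
        # Remove the q-component from every later column, including y.
        for k in range(j + 1, p + 1):
            R[j][k] = sum(qi * ck for qi, ck in zip(q, cols[k]))
            cols[k] = [ck - R[j][k] * qi for qi, ck in zip(q, cols[k])]
    # Back-substitution on the triangular system R[:, :p] theta = R[:, p].
    theta = [0.0] * p
    for j in reversed(range(p)):
        s = R[j][p] - sum(R[j][k] * theta[k] for k in range(j + 1, p))
        theta[j] = s / R[j][j]
    # What remains of the y column is the residual y - Phi*theta.
    rss = sum(v * v for v in cols[p])
    return theta, rss
```

The residual sum of squares comes out of the factorization without forming the normal equations, which reflects the point made above about obtaining prediction errors directly from the QR transformation.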


1.5  MIMO  Extensions 


All the methods discussed have their MIMO extensions. The arguments made for using high-order
ARX models of SISO systems apply equally well to MIMO systems. Similarly, the Toeplitz based
methods are also extensible to MIMO systems. So, in principle the methodologies should carry
forward. However, issues of parametrization can become very important because with too many
inputs and outputs, the computational burden can become excessive. Unfortunately, the extension of
the parametric robustness tests to the multivariable case remains unsolved.
1.6  Brief  Summary  and  Relation  to  Other  Approaches


It  is  fair  to  say  that  many  of  the  ideas  discussed  here  for  set-estimation  have  been  influenced  by 
the efforts of other researchers as well as by our own previous successes and failures. In what follows
we  give  a  brief  summary  of  some  of  the  recent  literature. 


1.6.1  Ellipsoid  Parameter  Bounds 

In our previous work on set-membership identification, we showed how to obtain a set of models
that are consistent with a given set of data and a given set of prior assumptions on the possible
nonparametric uncertainty and disturbances (see section 2.3 for a brief discussion; more details are in
Kosut et al.[36, 39, 47, 40] and the papers in the Appendix). In these papers the model parameters
are shown to lie in a set defined by a quadratic form, i.e., an ellipsoid or hyperboloid, depending on
the data. A similar approach was used in Younce and Rohrs[70], Wahlberg and Ljung[65]. Earlier
versions of this approach based on least-squares are in Kosut[40] and the related robust control of
ellipsoid bounded plants in Lau et al.[45, 46, 44]. In Wahlberg[65], Laguerre expansions were used
to model the dominant system dynamics and ellipsoid bounds were also obtained.

The difficulty with the above approaches is that in order to compute the ellipsoid bound, a hard
bound on the non-parametric dynamics is required, which unfortunately is precisely the knowledge
which may be difficult to obtain. Another important point is that these methods are based on
sufficient conditions to satisfy the prior $\mathcal{H}_\infty$ bound; hence, the sets can be conservative. In the
recent work of Poolla et al.[57], both necessary and sufficient conditions are established, but these
are used for model validation.


1.6.2  Stochastic  Embedding 

In  Goodwin  et  a/.[25,  26,  24]  a  stochastic  embedding  philosophy  is  adopted  (see  section  2.3.3  for 
a  brief  description).  It  is  assumed  that  both  the  unmodelled  dynamics  and  noise  are  drawn  from 
a  probabilistic  set  having  certain  amplitude  and  smoothness  properties.  These  properties  are  then 
estimated  by  maximum  likelihood  techniques  resulting  in  what  we  have  called  here,  a  probabilistic 
set-membership  estimator.  These  ideas  have  motivated  our  method  of  using  high-order  least-squares 
to  estimate  the  set.  The  use  of  high-order  least  squares  as  discussed  here  is  also  discussed  in  Kosut 
and  Anderson[37]. 


1.6.3  Model  Order  Reduction 

The  use  of  Laguerre  expansions,  as  mentioned  above,  may  prove  very  useful  in  our  high-order 
least-squares  approach,  because  the  orders  can  be  significantly  reduced  prior  to  LS  estimation,  e.g., 
Wahlberg[66, 67].


1.6.4  Iterative  Identification  and  Control  Design 

Several approaches have been put forward which involve iterating on closed-loop data while successively
adjusting data filters for identification and redesigning the controller, e.g., Schrama[59], Lee
et al.[48], Zang et al.[71], Hansen et al.[29], Yam et al.[69], Kosut[41]. The techniques discussed
here for set-membership identification are a necessary part of these schemes.

1.6.5  $\mathcal{H}_\infty$  Identification

Several researchers have considered the problem of identification using the $\mathcal{H}_\infty$ norm starting from
bounded error frequency response data at a finite set of frequencies, e.g., Parker and Bitmead[56],
Gu and Khargonekar[32], Helmicki et al.[30]. Both linear and nonlinear algorithms have been
developed and bounds on the worst-case identification error are also derived. Although there are
some very interesting results contained in this work, we would rather start from time domain data,
which really is the source of frequency domain data in the first place. The new methods of $\mathcal{L}_\infty$
identification described in section 2.4 and Massoumnia and Kosut[51] (see Appendix) may prove to be a
more direct approach to this problem.

1.6.6  Set-Membership  Validation 

A  related  problem  to  set-membership  identification  is  that  of  model  set  validation.  In  Smith  and 
Doyle[63,  62],  the  following  model  validation  question  is  posed:  “Given  experimental  data  and  a 
model  with  both  additive  noise  and  norm  bounded  perturbations,  is  it  possible  that  the  model  could 
produce  the  observed  input-output  data?”  This  question  is  a  first  step  towards  the  reconciliation 
of prior assumptions on disturbance and model accuracy with observed data from a system. The
approach is based on frequency domain data with a $\mu$-like model structure.

In Poolla et al.[57], the model validation problem is posed using time-domain data and both
necessary and sufficient conditions are obtained for model validation, and hence, invalidation. Our
previous work on set-membership identification used only the sufficient conditions to produce the
ellipsoidal sets. The underlying theory in Poolla et al.[57], which provides both the necessary and
sufficient conditions for consistency, is based on certain Toeplitz forms. There are some similarities
with the Toeplitz forms used in the new $\mathcal{L}_\infty$ identification methods discussed in section 2.4.


Chapter  2 


Set-Membership  Identification 


In this chapter we give an overview of the fundamental problem of set estimation and a detailed
summary of our own contributions. The complete details of our work are contained in several papers
which are included in this report as an Appendix.


2.1  Problem  Formulation 

To illustrate the issues, suppose that the true, but unknown, system to be controlled is the
single-input-single-output stable discrete-time system,

$$S : \{\, y = Gu + He \mid e \in E(\lambda) \,\} \tag{2.1}$$

where $G$ and $H$ are unknown causal linear-time-invariant (LTI) systems with transfer functions
$G(z)$ and $H(z)$, respectively. The sequences $y$ and $u$ are, respectively, the sensed output and the
applied control input. The sequence $e$ is unpredictable except that it is known to be in a set
$E(\lambda)$ where $\lambda$ is unknown. Likely candidates for $E(\lambda)$ are $E_{pow}(\lambda)$, the set of sequences with
power bound $\lambda$, or $E_{iid}(\lambda)$, iid zero-mean sequences with variance $\lambda$. For robust control design,
it is necessary to have a set description of the plant system. For example, consider the set$^1$

$$\mathcal{M} : \{\, y = (\hat G + \Delta \hat W)u + \hat H e \mid \|\Delta\|_{\mathcal{H}_\infty} \le \gamma,\ e \in E(\hat\lambda) \,\}$$

If $E(\hat\lambda) = E_{pow}(\hat\lambda)$, then $\mathcal{M}$ is typical for $\mathcal{H}_\infty$ control design. If $E(\hat\lambda) = E_{iid}(\hat\lambda)$, then mixed
$\mathcal{H}_2/\mathcal{H}_\infty$ control design methods apply. There are many combinations possible. However, in all the above
cases, the quantities with “hats” are available a priori to the designer. The problem addressed
here, referred to as set estimation, is to determine these quantities a posteriori from the finite data
record,

$$\{\, y_t, u_t \mid t = 1, \ldots, N \,\}$$

where $y_t$ and $u_t$ are the values of the sequences $y$ and $u$, respectively, at time $t$. In the remainder
of this section, some of the issues involved in set estimation are discussed and some promising
methods recently proposed are reviewed. More details on these specific techniques can be found in
the special issue [35] and the references therein.

$^1$If $\Delta$ is stable, $\|\Delta\|_{\mathcal{H}_\infty} = \sup_\omega |\Delta(e^{j\omega})|$; otherwise, $\|\Delta\|_{\mathcal{H}_\infty} = \infty$.


2.2  Least-Squares  Parameter  Estimation 


Least-squares  (LS)  methods  of  parameter  estimation  enjoy  a  very  wide  usage,  and  the  underlying 
theory  is  well  developed,  especially  in  a  probabilistic  framework.  In  section  2.2.2  we  show  that 
the LS estimator together with high-order ARX models leads naturally to transfer function uncertainty
which is parametric. Moreover, the parameter uncertainty can be either probabilistic or
deterministic,  depending  on  prior  assumptions. 

Parametric uncertainty has proven much more difficult for robust control design than the
nonparametric dynamic uncertainty associated with $\mathcal{H}_\infty$ methods. However, as discussed here, the
parametric  uncertainty  set  produced  by  high-order  least-squares  seems  to  be  quite  tractable  and 
leads  to  some  new  approaches  to  robust  control  design  (section  2.2.3). 

The  high-order  ARX  model  sets,  although  compatible  with  the  assumptions  in  the  LS  theory, 
can  be  viewed  as  an  intermediate  step  to  encoding  the  data  into  a  model  more  suitable  for  robust 
control  design.  To  reduce  the  model  order,  we  have  examined  the  use  of  Laguerre  expansions 
(section  2.2.4)  before  LS  is  applied.  The  selection  of  the  Laguerre  kernels  may  have  to  be  based  on 
a  priori  information,  or  depend  on  a  desired  closed-loop  bandwidth. 

2.2.1  Statistical  Analysis 

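Parameter estimation via least-squares with an ARX model is perhaps the most widely used
approach to system identification. Consider the parametric ARX model set:

$$\mathcal{M} : \{\, A_\theta y = B_\theta u + e \mid \theta \in \mathbb{R}^p,\ e \in E_{iid}(\lambda) \,\} \tag{2.2}$$

where

$$A_\theta = 1 + \sum_{i=1}^{n} a_i z^{-i}, \qquad B_\theta = \sum_{i=1}^{m} b_i z^{-i}$$

$$\theta = [\, a_1 \cdots a_n \ \ b_1 \cdots b_m \,]^T$$

Thus,

$$y_t = \phi_t^T \theta + e_t$$

$$\phi_t^T = [\, -y_{t-1} \cdots -y_{t-n} \ \ u_{t-1} \cdots u_{t-m} \,]$$

The least-squares parameter estimate, based on a finite data record, is found from:

$$\hat\theta = \arg\min_\theta \frac{1}{N} \sum_{t=1}^{N} (y_t - \theta^T \phi_t)^2 \tag{2.3}$$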
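The least-squares ARX estimate can be illustrated on the smallest possible case. The sketch below assumes a first-order model ($n = m = 1$), so that $\theta = [a\ b]^T$ and $\phi_t = [-y_{t-1}\ u_{t-1}]^T$, and solves the resulting 2x2 normal equations in closed form. The helper names `simulate_arx1` and `ls_arx1`, the white Gaussian input, and the numerical values are illustrative assumptions, not taken from the report.

```python
import random

def simulate_arx1(a, b, lam, N, seed=0):
    """Simulate y_t = -a*y_{t-1} + b*u_{t-1} + e_t with iid noise of variance lam."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(N + 1)]  # white input (assumption)
    y = [0.0]
    for t in range(1, N + 1):
        e = rng.gauss(0.0, lam ** 0.5)
        y.append(-a * y[t - 1] + b * u[t - 1] + e)
    return y, u

def ls_arx1(y, u):
    """Least-squares estimate of theta = [a, b] with phi_t = [-y_{t-1}, u_{t-1}]."""
    s11 = s12 = s22 = r1 = r2 = 0.0
    for t in range(1, len(y)):
        p1, p2 = -y[t - 1], u[t - 1]
        s11 += p1 * p1
        s12 += p1 * p2
        s22 += p2 * p2
        r1 += p1 * y[t]
        r2 += p2 * y[t]
    # Solve [[s11, s12], [s12, s22]] [a, b]^T = [r1, r2]^T by Cramer's rule.
    det = s11 * s22 - s12 * s12
    a_hat = (s22 * r1 - s12 * r2) / det
    b_hat = (s11 * r2 - s12 * r1) / det
    return a_hat, b_hat
```

For example, `ls_arx1(*simulate_arx1(-0.5, 1.0, 0.01, 5000))` returns estimates near the simulated values; for realistic model orders the normal equations would be replaced by a QR-based solve.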


It is well known (Ljung[49]) that

$$\hat\theta \to \theta^*, \quad \text{as } N \to \infty, \ \text{w.p. 1}$$

where

$$\theta^* = \arg\min_\theta \frac{1}{2\pi} \int_{-\pi}^{\pi} S_{err}(\omega, \theta)\, d\omega$$
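with the “error” spectrum given by,

$$S_{err}(\omega,\theta) = |A_\theta(e^{j\omega}) G(e^{j\omega}) - B_\theta(e^{j\omega})|^2\, S_{uu}(\omega) + \lambda\, |A_\theta(e^{j\omega}) H(e^{j\omega})|^2$$

(here $S_{uu}(\omega)$ denotes the spectrum of the input $u$). In addition, if the system (2.1) is in the ARX
model set (2.2), then the parameter error $\hat\theta - \theta^*$ is asymptotically normally distributed, i.e., as
$N \to \infty$,

$$\sqrt{N}\,(\hat\theta - \theta^*) \in \mathcal{N}\big(0,\ \lambda\, [\mathcal{E}\{\phi_t \phi_t^T\}]^{-1}\big) \tag{2.4}$$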


where $\mathcal{E}\{\cdot\}$ denotes expectation. Observe that the system (2.1) is in the ARX model set if there
exists a parameter $\theta_0$ such that $G = B_{\theta_0}/A_{\theta_0}$ and $H = 1/A_{\theta_0}$. Although this is not true in general,
the true system can be arbitrarily well approximated by a high order ARX model. Specifically,
set $n = m$. Then, there is a sufficiently large value of $n$ and a corresponding parameter $\theta_0 \in \mathbb{R}^{2n}$
such that $\|H^{-1}G - B_{\theta_0}\|_{\mathcal{H}_\infty}$ and $\|H^{-1} - A_{\theta_0}\|_{\mathcal{H}_\infty}$ are arbitrarily small. Hence, for some sufficiently
large values of $N$ and $n$, reasonable estimates of $\mathcal{E}\{\phi_t\phi_t^T\}$ and $\lambda$ are

$$R = \frac{1}{N}\sum_{t=1}^{N} \phi_t \phi_t^T, \qquad \hat\lambda = \frac{1}{N-2n}\sum_{t=1}^{N} \hat e_t^2$$

with

$$\hat e_t = y_t - \phi_t^T \hat\theta$$

the estimated prediction error. The above asymptotic approximations inspire several types of
high-order ARX set estimators.


2.2.2  High-Order  ARX  Sets 


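Let $\hat G = \hat B/\hat A$ and $\hat H = 1/\hat A$ denote LS/ARX estimates of $G$ and $H$. Let $m = n$ where $n$ is large.
Then, the true system is well approximated by,

$$\hat A y = \hat B u - \delta^T R^{-1/2}\phi + e \tag{2.5}$$

where $\delta \in \mathbb{R}^{2n}$ is the normalized (unknown) parameter error:

$$\delta = R^{1/2}(\hat\theta - \theta) \tag{2.6}$$

Since $e \in E_{iid}(\lambda)$, for large $N$, we have the following approximate statistical properties:

$$\delta \in \mathcal{N}\Big(0,\ \frac{\lambda}{N} I_{2n}\Big), \qquad (N-2n)\frac{\hat\lambda}{\lambda} \in \chi^2(N-2n)$$

Therefore,

$$\frac{N\,\delta^T\delta}{\lambda} \in \chi^2(2n), \qquad \frac{N\,\delta^T\delta}{2n\,\hat\lambda} \in F(2n, N-2n)$$

where $F(2n, N-2n)$ is the F-distribution with degrees of freedom $2n$ and $N-2n$. Hence, the value of
$\alpha$ for which

$$\mathrm{Prob}\Big\{ \delta^T\delta \le \frac{2n}{N}\,\alpha\,\hat\lambda \Big\} = \eta$$

can be determined from an F-distribution table. To be safe, suppose we set $\eta$ very high, say,
$\eta = .999$. Then for typical numbers such as $N > 1000$ and $n = 10$, we get $\alpha = 2.27$. For large $n$,
say $n = 60$, and large $N \gg n$, we get $\alpha \approx 1.45$, and so on. In addition, for large $N$, $e \in E(\hat\lambda)$.
Hence, for large $n$ and large $N$, the system (2.1) is in the model set

$$\mathcal{M}_{arx} : \Big\{\, \hat A y = \hat B u - \delta^T R^{-1/2}\phi + e \ \Big|\ \delta^T\delta \le \frac{2n}{N}\,\alpha\,\hat\lambda,\ e \in E_{iid}(\hat\lambda) \,\Big\} \tag{2.7}$$

with probability of at least $\eta$.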
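The scale factor taken from the F-distribution table can be approximated numerically. The sketch below rests on two assumptions that are not part of the report: for $N \gg n$ the distribution $F(2n, N-2n)$ is replaced by its limit $\chi^2(2n)/(2n)$, and the chi-square quantile is approximated by the Wilson-Hilferty cube-root formula. The function names are hypothetical. A statistics library with an exact F quantile would be the better choice when $N$ is not large.

```python
from statistics import NormalDist

def alpha_large_N(n, eta):
    """Approximate eta-quantile of F(2n, N-2n) for N >> n.

    Uses the limit F(2n, N-2n) -> chi2(2n)/(2n) and the Wilson-Hilferty
    approximation to the chi-square quantile.
    """
    k = 2 * n
    z = NormalDist().inv_cdf(eta)  # standard normal quantile
    c = 2.0 / (9.0 * k)
    return (1.0 - c + z * c ** 0.5) ** 3

def delta_bound(n, N, eta, lam_hat):
    """Bound (2n/N)*alpha*lam_hat on delta^T delta for probability level eta."""
    return (2.0 * n / N) * alpha_large_N(n, eta) * lam_hat
```

With $\eta = 0.999$, this approximation gives values close to the figures quoted above for $n = 10$ and $n = 60$.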

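It is interesting to compare the above probabilistic result with a strictly deterministic view. For
example, the orthogonality properties of the least-squares estimator give:

$$N\,\delta^T\delta = \sum_{t=1}^{N} e_t^2 - (N-2n)\hat\lambda$$

This property requires no probabilistic assumptions on the data. Hence,

$$\frac{1}{N}\sum_{t=1}^{N} e_t^2 \le \eta \ \Longrightarrow\ N\,\delta^T\delta \le N(\eta - \hat\lambda) + 2n\hat\lambda$$

(here $\eta$ denotes a bound on the noise power rather than the probability level used above).
The estimate $\hat\lambda$ is a possible choice for $\eta$ which gives a result very similar to that above.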
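The deterministic identity can be checked numerically. The sketch below assumes a first-order ARX system that lies exactly in the model set ($n = 1$, so $2n = 2$), with $R = \frac{1}{N}\sum_t \phi_t\phi_t^T$ and $\hat\lambda = \frac{1}{N-2n}\sum_t \hat e_t^2$, so that $N\delta^T\delta$ is the quadratic form of the parameter error and $(N-2n)\hat\lambda$ is the residual sum of squares. The disturbance is drawn from a uniform distribution only to stress that no particular distribution is needed. The function name and all numerical values are illustrative assumptions.

```python
import random

def check_orthogonality_identity(a=-0.5, b=1.0, N=200, seed=1):
    """Return (N*delta^T*delta, sum(e^2) - (N-2n)*lam_hat) for a first-order ARX fit."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(N + 1)]
    e = [rng.uniform(-0.3, 0.3) for _ in range(N + 1)]  # any sequence will do
    y = [0.0]
    for t in range(1, N + 1):
        y.append(-a * y[t - 1] + b * u[t - 1] + e[t])
    # Regressors phi_t = [-y_{t-1}, u_{t-1}] and the 2x2 normal equations.
    phi = [(-y[t - 1], u[t - 1]) for t in range(1, N + 1)]
    out = y[1:]
    s11 = sum(p * p for p, _ in phi)
    s12 = sum(p * q for p, q in phi)
    s22 = sum(q * q for _, q in phi)
    r1 = sum(p * yt for (p, _), yt in zip(phi, out))
    r2 = sum(q * yt for (_, q), yt in zip(phi, out))
    det = s11 * s22 - s12 * s12
    a_hat = (s22 * r1 - s12 * r2) / det
    b_hat = (s11 * r2 - s12 * r1) / det
    # N*delta^T*delta = (theta_hat - theta)^T (sum phi phi^T) (theta_hat - theta).
    da, db = a_hat - a, b_hat - b
    n_delta_sq = s11 * da * da + 2.0 * s12 * da * db + s22 * db * db
    # (N - 2n)*lam_hat is the residual sum of squares of the LS fit.
    rss = sum((yt - p * a_hat - q * b_hat) ** 2 for (p, q), yt in zip(phi, out))
    true_energy = sum(v * v for v in e[1:])
    return n_delta_sq, true_energy - rss
```

The two returned numbers agree to rounding error, since the identity is purely algebraic once the data satisfy $y_t = \phi_t^T\theta + e_t$.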


2.2.3  Robust  Control  with  ARX  Sets 


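In this section we discuss the issue of robust control design under the assumption that the true
system is in the ARX model set $\mathcal{M}_{arx}$ of (2.7). Suppose we apply the LTI feedback controller

$$u = -Ky \tag{2.8}$$

where $K$ stabilizes the “nominal” ARX system ($\delta = 0$),

$$\hat A y = \hat B u + e$$

Applying the control to the actual system model (2.5) gives the closed-loop system

$$\begin{bmatrix} y \\ u \end{bmatrix} = \begin{bmatrix} T_\delta e \\ -Q_\delta e \end{bmatrix} = \frac{1}{1 - \delta^T h} \begin{bmatrix} T e \\ -Q e \end{bmatrix}$$

where

$$T = \frac{1}{\hat A + \hat B K}, \qquad Q = \frac{K}{\hat A + \hat B K}, \qquad h = R^{-1/2} \begin{bmatrix} DT \\ DQ \end{bmatrix}, \qquad D = \begin{bmatrix} z^{-1} \\ \vdots \\ z^{-n} \end{bmatrix}$$

Because $K$ stabilizes the nominal system, $T$, $Q$ and $h$ are all stable.

Recall from the Nyquist theorem that since $h$ is stable, the closed-loop system is stable if and
only if,

$$|1 - \delta^T h(e^{j\omega})| \ne 0, \quad \forall\, \delta^T\delta \le \rho^2, \ \forall\, \omega$$

with $\rho^2$ denoting the bound on $\delta^T\delta$, e.g., $\rho^2 = \frac{2n}{N}\alpha\hat\lambda$ in (2.7).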
This is equivalent to

$$\rho < \rho_{stab}$$

where $\rho_{stab}$, the so-called “real” stability margin, is given by,

$$\rho_{stab}^2 = \inf_\omega r(\omega), \qquad r(\omega) = \min_\delta \{\, \delta^T\delta \mid \delta^T h(e^{j\omega}) = 1 \,\}$$

Calculating $r(\omega)$ involves finding the minimum norm (least-squares) solution to the under-determined
set of equations $\delta^T h(e^{j\omega}) = 1$ at each frequency. Thus,

$$r(\omega) = \begin{cases} 1/\big[\,\|a\|^2 - (a^T b)^2/\|b\|^2\,\big], & b \ne 0 \\ 1/\|a\|^2, & b = 0 \end{cases}$$

where

$$a = \mathrm{Re}\, h(e^{j\omega}), \qquad b = \mathrm{Im}\, h(e^{j\omega})$$

Hence, a “probability of stability” can be stated as follows. If

$$\mathrm{Prob}\{\, \delta^T\delta \le \rho^2 \,\} = \eta$$

then

$$\rho < \rho_{stab} \ \Longrightarrow\ \mathrm{Prob}\{\, (1 - \delta^T h)^{-1} \ \text{stable} \,\} \ge \eta$$

It ought to be mentioned that no closed form solution is known for the stability margin, $\rho_{stab}$, in
the MIMO case.
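The frequency-wise computation of the real stability margin can be sketched as follows. The code assumes that $r(\omega)$ is the squared norm of the smallest real $\delta$ satisfying $\delta^T h(e^{j\omega}) = 1$, i.e., $\delta^T a = 1$ and $\delta^T b = 0$ with $a$ and $b$ the real and imaginary parts of $h$, and that the infimum over frequency is approximated on a finite grid. The function names and the two-element `h_demo` response are hypothetical illustrations, not the closed-loop $h$ of any particular design.

```python
import cmath
import math

def r_of_h(h):
    """Squared norm of the smallest real delta with delta^T h = 1 (h: list of complex)."""
    a = [v.real for v in h]
    b = [v.imag for v in h]
    aa = sum(x * x for x in a)
    bb = sum(x * x for x in b)
    ab = sum(x * y for x, y in zip(a, b))
    # Project a onto the orthogonal complement of b (no projection when b = 0).
    denom = aa if bb == 0.0 else aa - ab * ab / bb
    # denom <= 0 means a is parallel to b: no real delta satisfies both equations.
    return math.inf if denom <= 0.0 else 1.0 / denom

def rho_stab(h_of_omega, grid=2000):
    """Grid approximation of rho_stab = sqrt(inf over omega in [0, pi] of r(omega))."""
    return math.sqrt(min(r_of_h(h_of_omega(math.pi * i / grid)) for i in range(grid + 1)))

def h_demo(omega):
    """Hypothetical frequency response vector, used only to exercise the code."""
    return [0.5 * cmath.exp(-1j * omega), 0.25 * cmath.exp(-2j * omega)]
```

Because $r(\omega)$ can be discontinuous where the imaginary part of $h$ vanishes, a grid only approximates the infimum and may miss isolated frequencies, so a fine grid near such frequencies is advisable.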


2.2.4  Order  Reduction  via  Laguerre  Expansions 

Although high-order ARX model set estimation seems promising, there are some obvious impediments.
First, the controller (2.8) will also be of high order. Secondly, a determination of what is
meant  precisely  by  high  order  is  dependent  on  a  priori  knowledge  about  the  true  system.  Thirdly, 
the  statistical  properties  are  based  on  very  large  data  lengths,  and  again,  a  precise  value  depends 
on  the  true  system  properties. 

To offset the high order, an alternative is to use a more parsimonious model parametrization.
For example, using Laguerre expansions, as proposed in Wahlberg and Ljung[65], may result in
considerably fewer parameters to obtain the same level of approximation as a model expanded
in the backward shift operator $z^{-1}$. However, the efficacy of this approach depends on prior
information regarding the accuracy of some dominant pole locations. The basis for the Laguerre
expansions is the fact that for any stable transfer function $T(z)$, and any $a \in (-1, 1)$, there is a
unique bounded real sequence $\alpha_k$ such that

$$T(z) = \sum_{k=1}^{\infty} \alpha_k L_k(z, a)$$

where

$$L_k(z, a) = \frac{\sqrt{1 - a^2}}{z - a} \left( \frac{1 - az}{z - a} \right)^{k-1}$$

Observe that for $a = 0$, $L_k(z, 0) = z^{-k}$, which returns the usual expansion in the delay $z^{-1}$. The
appropriate order of the expansion depends on the convergence properties of the partial sums. For
example, $1/(z - p)$ has a Laguerre expansion of order $n = 1$, provided that $a = p$. Since typically, $p$
is not known, a good choice of $n$ will depend on prior knowledge of $p$. For ARX models, replace
the expansions of $H^{-1}$ and $H^{-1}G$ in the shift operator with Laguerre expansions. To pick a good
Laguerre kernel requires either prior knowledge or else some data dependent means of selection.
Another possibility is to select the kernel to reflect the desired closed-loop bandwidth.
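The two properties noted above (reduction to the delay expansion at $a = 0$, and an order-one expansion when the Laguerre pole matches the system pole) can be checked numerically. The sketch assumes the standard discrete Laguerre form $L_k(z,a) = \frac{\sqrt{1-a^2}}{z-a}\big(\frac{1-az}{z-a}\big)^{k-1}$; the function names and test values are hypothetical.

```python
import cmath
import math

def laguerre(k, z, a):
    """Discrete Laguerre function L_k(z, a) for k >= 1 and |a| < 1 (assumed form)."""
    return math.sqrt(1.0 - a * a) / (z - a) * ((1.0 - a * z) / (z - a)) ** (k - 1)

def laguerre_sum(coeffs, z, a):
    """Evaluate the truncated expansion sum_k coeffs[k-1] * L_k(z, a)."""
    return sum(c * laguerre(k, z, a) for k, c in enumerate(coeffs, start=1))

# Evaluate on the unit circle at an arbitrary frequency.
z = cmath.exp(0.7j)
delay_case = laguerre(3, z, 0.0)  # equals z**-3 when a = 0
p = 0.8
one_term = laguerre_sum([1.0 / math.sqrt(1.0 - p * p)], z, p)  # equals 1/(z - p)
```

When `a` is only close to the true pole, more terms are needed, which is why the choice of kernel depends on prior knowledge of the dominant dynamics.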

An affine model set, e.g., a Laguerre expansion for $G$, can also offset the issue of determining
what is meant by a large data length. With this model, it is possible to precisely compute statistical
properties without the need for either large model orders or large data lengths, e.g., Kosut and
Anderson[34]. However, another useful asymptotic property, also true for ARX models, is that if
the input is white, then the first $m$ impulse response coefficients of $G$ are asymptotically unbiased,
where $m$ is the order of $B_\theta$. Other useful results follow from this fact, e.g., Aling and Kosut[1].


2.3  Ellipsoid  Set-Membership  Identification 

2.3.1  Uncertain  Non-parametric  Dynamics 

When  an  upper  bound  on  the  nonparametric  model  errors  is  known  from  prior  knowledge,  it  is 
possible  to  compute  a  parameter  set  which  is  consistent  with  the  data.  Depending  on  the  data,  the 
parameter set is either an ellipsoid or a hyperboloid. In the latter case the data is considered to be
“bad”,  that  is,  the  spectral  content  of  the  data  is  concentrated  too  heavily  at  those  frequencies  where 
the  nonparametric  dynamics  dominate.  Thus,  an  ellipsoid  indicates  “good”  data  and  there  are 
several  schemes  for  minimizing  the  size  of  these  ellipsoids.  Computation  of  the  bounding  ellipsoids 
is  virtually  no  different  than  standard  least-squares  computations  and  can  be  accomplished  in  a 
batch  or  recursively.  We  plan  to  investigate  efficient  methods  in  our  future  work.  Various  kinds 
of  prior  information  can  also  be  included  using  the  bounding  ellipsoid  approach.  Some  of  these 
computational  problems  are  generic,  not  specifically  for  robust  control  and  identification,  and  are 
surveyed  by  Deller[14]. 

To see the main result more clearly, we can state the problem as follows: Use the measured
input/output data

$$\{\, y_t, u_t \mid t = 1, \ldots, N \,\} \tag{2.9}$$

to obtain a model set suitable for robust control design. To do this we need to make some
assumptions. The first is that the system which produces the data is disturbance-free and linear time
invariant. Thus,

y  =  Gu  (2.10) 

where $G$ has the (discrete-time) transfer function $G(z)$. Assume also that the true system is a
member of the model set

$$\mathcal{G} = \{\, G_\theta (1 + \Delta_G W_G) : \theta \in \Theta_{prior},\ \|\Delta_G\|_{\mathcal{H}_\infty} \le 1 \,\} \tag{2.11}$$

Thus, the model set consists of parametrized models with a multiplicative nonparametric error
bounded by $W_G(z)$. The set $\Theta_{prior}$ represents the prior information by which the parameter vector
is confined. We further characterize the parametric transfer function by using the standard ARX
form in Ljung[49]:

$$\begin{aligned}
G_\theta(z) &= B_\theta(z)/A_\theta(z) \\
B_\theta(z) &= b_1 z^{-1} + \ldots + b_m z^{-m} \\
A_\theta(z) &= 1 + a_1 z^{-1} + \ldots + a_n z^{-n} \\
\theta &= (\, a_1 \ldots a_n \ \ b_1 \ldots b_m \,)^T
\end{aligned} \tag{2.12}$$

The result in Kosut et al.[39] is the following, which forms the basis for the parameter
set-membership estimation.


Theorem 2.3.1 Under the assumptions stated above, all parameters which are consistent with the
measured data and the prior information are in the set

$$\Theta_{\text{prior}} \cap \Theta_{wc}$$

where the “worst case equation error set” $\Theta_{wc}$ is defined by

$$\Theta_{wc} = \{\, \theta \in \mathbb{R}^p : \|A_\theta y - B_\theta u\|_N \le \|B_\theta W_G u\|_N \,\}$$

($\|x\|_N = \left(\sum_{t=1}^{N} x_t^T x_t\right)^{1/2}$ is the usual $\ell_2$ norm on $t \in [1, N]$.) The term “worst
case” refers to the fact that the nonparametric uncertainty $\Delta_G$ will take on the worst possible value
such that $\|\Delta_G\|_{H_\infty} \le 1$. The set $\Theta_{wc}$ can be easily computed using least-squares methods and may be
a hyperboloid, ellipsoid or the empty set depending on the data (see Kosut et al.[39]). Thus, the
true system is guaranteed to be in the set:

$$\mathcal{G} = \{\, G_\theta (1 + \Delta_G W_G) : \theta \in \Theta_{\text{prior}} \cap \Theta_{wc},\ \|\Delta_G\|_{H_\infty} \le 1 \,\} \tag{2.13}$$
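
The remark that the set may be a hyperboloid, an ellipsoid or empty can be made concrete by noting that both sides of the inequality defining the worst-case equation error set are quadratic in the parameter vector. The following Python sketch is illustrative only and is not taken from Kosut et al.[39]. Its function names, the convention of summing from t = max(n, m) onward so that no initial conditions are needed, the caller-supplied filtered signal `w` (standing for W_G u), and the Cholesky-based test are all assumptions of this example.

```python
import math

def quadratic_form(y, u, w, n, m):
    """Return (Gamma, beta, alpha) such that the worst-case set is
    { theta : theta' Gamma theta - 2 beta' theta + alpha <= 0 },
    with theta = (a_1..a_n, b_1..b_m). w is the filtered input W_G u."""
    p, k = n + m, max(n, m)
    Gamma = [[0.0] * p for _ in range(p)]
    beta = [0.0] * p
    alpha = 0.0
    for t in range(k, len(y)):
        # Equation error: (A y - B u)_t = y_t - phi' theta
        phi = [-y[t - i] for i in range(1, n + 1)] + [u[t - j] for j in range(1, m + 1)]
        # Bounding signal: (B W_G u)_t = psi' theta
        psi = [0.0] * n + [w[t - j] for j in range(1, m + 1)]
        for i in range(p):
            beta[i] += phi[i] * y[t]
            for j in range(p):
                Gamma[i][j] += phi[i] * phi[j] - psi[i] * psi[j]
        alpha += y[t] ** 2
    return Gamma, beta, alpha

def cholesky(G):
    """Lower-triangular Cholesky factor of G, or None if G is not positive definite."""
    p = len(G)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = G[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return None
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

def classify(y, u, w, n, m):
    """Classify the worst-case equation error set from data."""
    Gamma, beta, alpha = quadratic_form(y, u, w, n, m)
    L = cholesky(Gamma)
    if L is None:
        # Gamma is indefinite or singular: the set is unbounded (hyperboloid type)
        return "not an ellipsoid"
    # Forward substitution L z = beta, so that beta' Gamma^{-1} beta = z' z
    z = []
    for i in range(len(beta)):
        z.append((beta[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    # Completing the square: (theta - theta_c)' Gamma (theta - theta_c) <= level
    level = sum(v * v for v in z) - alpha
    return "ellipsoid" if level >= 0.0 else "empty"
```

With a small error weight the quadratic term stays positive definite and the set is an ellipsoid. A large weight makes the term indefinite, which corresponds to the “bad” data case described above.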

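Instead of multiplicative model errors, we have also considered additive model error sets, i.e.,

$$\mathcal{G} = \{\, G_\theta + \Delta_G W_G : \theta \in \Theta_{\text{prior}},\ \|\Delta_G\|_{H_\infty} \le 1 \,\} \tag{2.14}$$

The resultant parameter set is then given by

$$\Theta_{wc} = \{\, \theta \in \mathbb{R}^p : \|A_\theta y - B_\theta u\|_N \le \|A_\theta W_G u\|_N \,\}$$

since in this case $A_\theta y - B_\theta u = A_\theta \Delta_G W_G u$.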

Several  other  model  error  formulations  can  be  used,  e.g.,  inverse  multiplicative,  feedback  and 
coprime  factored.  We  will  not  discuss  them  here,  but  merely  state  them  to  indicate  that  this  is  a 
versatile  approach  which  allows  various  kinds  of  prior  information.  More  specific  details  and  results 
using the set-membership approach are described in Kosut et al.[36], a copy of which is contained
in  this  report  as  an  Appendix. 

2.3.2  Robust  Control  Design  of  Ellipsoid  Sets 

As a first step in using the ellipsoidal parameter set information, we simplified the robust control
design problem to the case of FIR plants in an ellipsoidal set. Details can be found in Lau et
al.[45, 46], which describe the continuous-time and discrete-time cases, respectively. Copies of these
papers are contained in the Appendix.

We started with the simplifying assumption that the plant state-space description depended on
uncertain parameters in the output matrix which are only known to lie in an ellipsoidal set. The
desired control is chosen to minimize the maximum linear quadratic regulator (LQR) cost over all
plants with parameters in the given set. Although no a priori form is assumed for the minimax
control, it turns out that it is the LQR control for one of the plants in the set, the worst-case
plant. By defining an appropriate operator mapping an element from the given ellipsoidal set to
an element of the same set, the existence of this worst-case plant is proved. A simple algorithm is
used to compute the worst-case plant.

The  assumption  that  the  output  matrix  in  the  plant  description  contains  all  the  uncertainty 
deserves  further  discussion.  First,  this  is  a  natural  extension  of  the  discrete  FIR  finite-horizon 
problem  solved  in  Lau  et  al.[46].  In  the  continuous  case,  Laguerre  models  can  be  used  so  that  the 
identification is reduced to estimating the Laguerre coefficients (see Wahlberg[64]). Uncertainty in
the  Laguerre  coefficients  can  then  be  described  by  set  membership  of  the  output  matrix.  Second, 
by  limiting  uncertain  parameters  to  the  output  matrix,  we  simplify  the  analysis  and  can  gain  more 
insights  than  if  we  had  included  parameter  uncertainty  in  the  plant  dynamics  also. 

Specifically,  we  consider  the  following  family  of  systems 

$$\dot{x}(t) = A x(t) + b u(t), \qquad x(0) = x_0 \tag{2.15}$$

$$y(t) = c^T x(t), \tag{2.16}$$

where $A$, $b$, and $x_0$ are fixed and given, and

$$c \in \Theta = \{\, \theta : (\theta - \theta_c)^T R\, (\theta - \theta_c) \le 1,\ R = R^T > 0 \,\}. \tag{2.17}$$

For a given control $u : \mathbb{R}_+ \to \mathbb{R}$ and a fixed $c \in \Theta$, the LQR cost is defined to be

$$J(u, c) = \int_0^\infty \left[ r\,u(t)^2 + y(t)^2 \right] dt. \tag{2.18}$$

We assume that $(A, b)$ is controllable (or at least stabilizable) and $(c, A)$ is observable (or at least
detectable) for all $c$ in $\Theta$. The robust control design problem is to find a control $u$ that solves the
following minimax problem:

$$\min_{u} \max_{c \in \Theta} J(u, c). \tag{2.19}$$

Since no a priori form, such as linear state feedback, is assumed for the control $u$, the minimization
in (2.19) is over all possible $u : \mathbb{R}_+ \to \mathbb{R}$. Note also that we chose the initial time $t = 0$ for
convenience only; the problem can be posed at any initial time $t = t_0$. Therefore, one can design a
new controller each time $\Theta$ gets updated.
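
As a hedged illustration of the minimax problem (2.19), consider the scalar special case with a single state, where the ellipsoid (2.17) reduces to an interval and the Riccati equation has a closed-form solution. In that case the state trajectory for a given $u$ does not depend on $c$, so $J(u, c)$ grows with $c^2$ for every $u$ and the worst-case plant is the interval endpoint of largest magnitude. The sketch below is not the algorithm of Lau et al.[45, 46]. The function names and the restriction to one state are assumptions made for this example, and the interval is assumed not to contain $c = 0$ when the plant is unstable, so that detectability holds.

```python
import math

def lqr_cost(a, b, c, r, x0):
    """Optimal LQR cost for x' = a x + b u, y = c x, J = int (r u^2 + y^2) dt.
    Uses the stabilizing root p of 2 a p - (b^2 / r) p^2 + c^2 = 0."""
    p = (r / b ** 2) * (a + math.sqrt(a ** 2 + (b * c) ** 2 / r))
    return p * x0 ** 2

def worst_case_c(c_center, R):
    """Endpoint of the interval (c - c_center)^2 R <= 1 with the largest |c|."""
    half_width = 1.0 / math.sqrt(R)
    lo, hi = c_center - half_width, c_center + half_width
    return hi if abs(hi) >= abs(lo) else lo

def minimax_cost(a, b, r, x0, c_center, R):
    """Worst-case output gain and the LQR cost of the corresponding plant."""
    c_star = worst_case_c(c_center, R)
    return c_star, lqr_cost(a, b, c_star, r, x0)
```

For example, `minimax_cost(1.0, 1.0, 1.0, 1.0, 2.0, 4.0)` picks the endpoint of the interval [1.5, 2.5] with the largest magnitude and returns the LQR cost of that plant. For plants with more than one state the worst-case plant is not generally available in closed form.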

The cost objective in (2.18) and the ellipsoidal set in (2.17) lead to another interesting interpretation for the minimax problem in (2.19) once we rewrite (2.18) as

$$J(u, c) = \int_0^\infty \left[ r\,u(t)^2 + x^T(t)\, c\, c^T x(t) \right] dt. \tag{2.20}$$

Now,  instead  of  saying  that  we  are  designing  a  controller  for  a  set  of  plants  described  by  (2.15) 
through  (2.17),  we  can  also  say  that  we  are  designing  a  controller  for  a  set  of  objective  functions. 
This  interpretation  contrasts  with  the  standard  LQR  design  where  one  controller  is  obtained  for 
the  selected  weighting  matrices.  Therefore,  the  minimax  control  from  (2.19)  is  less  sensitive  to 
how  the  states  are  penalized  in  the  cost.  This  kind  of  control  design  method  should  be  applicable 
to  many  practical  situations  as  we  seldom  know  exactly  how  much  one  state  should  be  weighted 
against another.

2.3.3 Comparison with Stochastic Embedding

In the work of Goodwin et al.[24, 25, 26] a stochastic embedding philosophy is adopted which makes
no  assumptions  on  model  order  or  data  length.  It  is  assumed  that  both  the  unmodelled  dynamics 
and  noise  are  drawn  from  a  probabilistic  set  having  certain  amplitude  and  smoothness  properties. 
These  properties  are  then  estimated  by  maximum  likelihood  techniques  resulting  in  a  set  estimator. 

Since  all  the  trouble  is  related  to  that  part  of  the  system  which  is  not  modeled,  i.e.,  the  “bias,”  it 
makes  no  sense  to  try  to  estimate  the  bias  in  the  form  of  a  parametrized  model.  That  is  tantamount 
to  an  additive  high  order  plant  model  component  which  should  have  been  incorporated  in  the  plant 
model  in  the  first  place,  e.g.,  high-order  ARX  model  sets. 

To see the main idea, assume that the true system is described by (2.1), and an estimate $G_{\hat\theta}$ of
$G$ has been obtained from

$$\hat\theta = \arg\min_{\theta} \frac{1}{N} \sum_{t=1}^{N} (y - G_\theta u)_t^2$$

Since the model structure is incompatible with the true system, $G_{\hat\theta}$ will be a biased estimate of $G$.
We now make the assumption that the true system is the sum of a model in the model set and a
bias term which has an expectation value of zero:

$$G(z) = G_{\theta_0}(z) + \Delta(z) \quad \text{with} \quad E\{\Delta(z)\} = 0$$

Here,  the  expectation  is  not  taken  over  the  data  probability  space,  but  over  the  unknown  bias  model 
set.  In  other  words,  the  complicated  problem  of  relating  the  bias  to  the  data  and  the  mismatch 
in  structure  of  the  true  system  and  the  model  is  avoided  by  simply  assuming  that  the  bias  model 
is  a  zero-mean  random  variable.  By  modeling  the  bias  in  this  rudimentary  form,  a  bias  model  set 
parametrization  is  obtained  which  is  described  by  a  small  number  of  parameters,  yet  is  capable  of 
representing  a  large  set  of  error  models. 

As an example, assume that the expectation of the squared bias model impulse response is
exponentially decaying:

$$\Delta(z) = \sum_{i=1}^{\infty} \beta_i z^{-i}, \qquad E\{\beta_i^2\} = \alpha \rho^i$$

where $0 < \rho < 1$. Thus, the bias model set is described by only two parameters, $\alpha$ and $\rho$. With
some additional assumptions, e.g., gaussianity and $G_\theta$ an affine Laguerre expansion, an explicit
formula  of  the  Fisher  information  matrix  can  be  derived  which  forms  the  basis  for  an  optimization 
procedure.  Hence,  the  two  parameters  which  describe  the  general  shape  and  size  of  the  less  certain 
part  of  the  system  model  can  be  directly  estimated  from  the  data. 
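
To give a feel for how two numbers can summarize the size of the bias model, the following Python sketch evaluates the exponentially decaying variance profile and its sum. It rests on an additional assumption that is not stated above, namely that the impulse response coefficients of the bias are zero mean and mutually uncorrelated. Under that assumption the expected squared magnitude of the bias frequency response equals the sum of the coefficient variances at every frequency. The function names are hypothetical.

```python
def bias_variance_profile(alpha, rho, n_terms):
    """Variances E{beta_i^2} = alpha * rho**i for i = 1, ..., n_terms."""
    return [alpha * rho ** i for i in range(1, n_terms + 1)]

def expected_bias_power(alpha, rho):
    """Closed-form geometric sum of alpha * rho**i over i >= 1.
    Equals E|Delta(e^{jw})|^2 only if the coefficients are zero mean
    and uncorrelated (an assumption of this sketch)."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    return alpha * rho / (1.0 - rho)
```

A value of $\rho$ close to 1 corresponds to a slowly decaying bias impulse response, while $\alpha$ scales its overall size.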


2.4 $\mathcal{L}_\infty$ Identification

In this section, a new criterion for system identification is introduced, which we loosely call $\mathcal{L}_\infty$-identification. At the present time, very little is known about this approach, and hence we can
only guess about the consequences for set-membership identification and the corresponding robust
controller design. However, like LS, this approach also leads to solving a convex optimization problem. Unlike LS, it does not appear at this time that the solution can be expressed in closed form.
Nevertheless, because the criterion is a convex function, numerical methods, specifically interior
point methods, will reliably compute the solution. In the future we hope to further understand
the properties of this estimator and develop reliable computational methods. Hopefully, this new
methodology will result in more natural set estimators suitable for robust control design.

The  parametric  approach  to  system  identification  is  based  on  selecting  an  appropriate  model 
structure  and  a  search  for  the  parameters  of  the  model  that  best  describes  the  data.  Usually,  the 
best  model  within  the  model  set  is  characterized  as  the  one  that  minimizes  a  selected  norm  of  the 
prediction errors, usually the 2-norm. In this section a new norm is introduced. Minimizing this
norm is equivalent, asymptotically, to minimizing the supremum of the spectrum of the prediction
error over all frequencies, or equivalently minimizing its $\mathcal{L}_\infty$ norm.

Consider a scalar finite sequence $\{e_t,\ t = 1, \ldots, N\}$ which represents the prediction errors computed
from the observed data and a guessed model parameter vector $\theta$. Based on this sequence, form the
$(N + M - 1) \times M$ matrix,


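$$
E_{NM} = \frac{1}{\sqrt{N}}
\begin{bmatrix}
e_1 & 0 & \cdots & 0 \\
e_2 & e_1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
e_M & e_{M-1} & \cdots & e_1 \\
\vdots & \vdots & & \vdots \\
e_N & e_{N-1} & \cdots & e_{N-M+1} \\
0 & e_N & \cdots & e_{N-M+2} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & e_N
\end{bmatrix}
\tag{2.21}
$$

with $1 \le M \le N$. Note that $E_{NM}$ is constant along the diagonals, and for $M = 1$, $E_{N1}$ is a
column vector with $e_t/\sqrt{N}$ as its elements. Denote this vector by $E_N$. Hence, the matrix $E_{NM}$ is
completely specified when $E_N$ ($= E_{N1}$) is given.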

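Define the new norm as the maximum eigenvalue of $E_{NM}^T E_{NM}$,

$$V_M(E_N) = \bar\lambda\left(E_{NM}^T E_{NM}\right) = \bar\sigma^2(E_{NM}) \tag{2.22}$$

where $\bar\lambda(F)$ denotes the maximum eigenvalue of $F$ and $\bar\sigma(F)$ denotes the maximum singular value
of $F$. Note that for $M = 1$, $V_M(E_N)$ is the usual quadratic norm. From Grenander and Szego[28],
we obtain the following limiting properties:

$$\lim_{N \to \infty} E_N^T E_N = \frac{1}{2\pi} \int_{-\pi}^{\pi} S_e(\omega)\, d\omega \tag{2.23}$$

$$\lim_{M \to \infty} \Big( \lim_{N \to \infty} \bar\sigma^2(E_{NM}) \Big) = \sup_{|\omega| \le \pi} S_e(\omega) \tag{2.24}$$

$$\lim_{M \to \infty} \Big( \lim_{N \to \infty} \underline{\sigma}^2(E_{NM}) \Big) = \inf_{|\omega| \le \pi} S_e(\omega) \tag{2.25}$$

where $\underline{\sigma}(F)$ denotes the minimum singular value of $F$ and we assume that $N$ goes to infinity faster than $M$.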
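
The norm in (2.22) can be evaluated without forming the tall matrix, because the product $E_{NM}^T E_{NM}$ is an $M \times M$ symmetric Toeplitz matrix whose entries are the sample autocorrelations of the prediction errors. The following Python sketch illustrates this; it is not taken from Massoumnia and Kosut[51]. The function names are hypothetical, and the power iteration (with an arbitrary starting vector and a fixed iteration count) is a simple stand-in chosen for this example, not a recommended solver.

```python
import math

def autocorr(e, lag):
    """Sample autocorrelation (1/N) * sum_t e_t e_{t+lag}."""
    N = len(e)
    return sum(e[t] * e[t + lag] for t in range(N - lag)) / N

def gram_matrix(e, M):
    """E_NM^T E_NM: symmetric Toeplitz matrix with entries autocorr(|i - j|)."""
    r = [autocorr(e, k) for k in range(M)]
    return [[r[abs(i - j)] for j in range(M)] for i in range(M)]

def v_norm(e, M, iters=500):
    """Approximate V_M(E_N), the largest eigenvalue of E_NM^T E_NM,
    by power iteration on the (positive semidefinite) Gram matrix."""
    G = gram_matrix(e, M)
    v = [1.0 + 0.01 * i for i in range(M)]  # arbitrary starting vector
    lam = 0.0
    for _ in range(iters):
        w = [sum(G[i][j] * v[j] for j in range(M)) for i in range(M)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
        # Rayleigh quotient of the normalized iterate
        lam = sum(v[i] * sum(G[i][j] * v[j] for j in range(M)) for i in range(M))
    return lam
```

For $M = 1$ the function returns the mean square of the sequence, which is the usual quadratic norm. Increasing $M$ cannot decrease the exact value, since each Gram matrix is a leading principal submatrix of the next.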

Relation (2.24) is very illuminating and shows that by minimizing $V_M$ as $M$ approaches infinity,
the supremum of the spectrum of the prediction error over all frequencies is minimized. Because of
this property, we refer to the identification problem using the new norm as the $\mathcal{L}_\infty$ identification
problem. In contrast, by minimizing the usual quadratic norm, the integral of the spectrum of the
prediction error over all frequencies is minimized (see Ljung[49]), and this can be referred to as
the $\mathcal{L}_2$ identification problem. It seems plausible that this norm is potentially very useful for robust
control design. More details can be found in Massoumnia and Kosut[51], which is included in the
Appendix.

Chapter 3

The  Future:  A  Graphical  User 
Interface  for  System  Identification 


A long range objective of the present work is the development of some mathematical and computational tools that are appropriate to the next generation of CACSD (Computer Aided Control
System  Design)  environments.  These  future  CACSD  packages  will  be  radically  different  from  the 
present  packages  in  that  they  will  truly  be  able  to  perform  control  systems  synthesis  and  rapid 
prototyping,  rather  than  just  analysis  and  simulation. 

In our view of the future, the engineer will commence the design with uncertain and/or incomplete information consisting partly of prior knowledge, measured data, and a set of closed-loop
design objectives and constraints. Once this information is fed into the CACSD program, it will in
turn generate controllers that meet the performance requirements while respecting the constraints,
or else inform the engineer that the constraints cannot all be satisfied and suggest some trade-offs
as well as alternative experiments to obtain data which may reduce uncertainty. As the engineer
thinks of more constraints and requirements, and/or obtains more data, these are entered into the
computer and are accounted for as they are entered. Thus, the CACSD process is still interactive,
but the level of interaction with the computer is much higher than it is at present. Moreover, the
interactive use of real data would be much more feasible than at present.

In order for this ideal situation to come about, it is necessary first to solve some important mathematical and computational problems residing in the interface between controller implementation
on  the  actual  system  and  controller  design  based  on  a  model  of  the  system. 

System identification is a typical example of an iterative, interactive procedure where several
results have to be computed, analyzed and recomputed with modified design parameters. In
order to do this, the user repeatedly has to enter a sequence of commands for computing frequency
responses, spectral density functions and prediction error norms. Even in high-level interactive
CACSD programs like MATRIXx and MATLAB it is virtually impossible to execute this procedure
without having to write command files for each specific task. Figure 3.1 shows typical paths and
functions in the MATRIXx system identification environment. Instead of concentrating on the
design task, the user is mainly occupied with designing, organizing and maintaining a large number
of specific programs for standard procedures. In conclusion, the current CACSD software is
inadequate for most users, both in the sense of user-friendliness and software design capabilities.

Figure 3.1: MATRIXx System Identification command overview

As an example, at Integrated Systems Inc. (ISI), we have recently introduced the XMATH
product, which provides an ideal platform for the development of portable window-based CACSD
software  such  as  system  identification  and  control  design.  The  important  difference  with  the  current 
interactive  CACSD  programs  in  terms  of  user  interface  is  that  XMATH  incorporates  an  interactive 
X-windows  based  GUI  development  tool.  This  makes  it  possible  to  efficiently  design  interactive 
mouse-driven  application  software  where  the  interaction  takes  place  through  one  or  more  specially 
suited  windows  for  each  of  these  tasks.  Such  windows  display  all  relevant  parameters,  as  well  as 
graphical  output  like  frequency  response  plots  and  bar  graphs  of  error  norms  as  a  function  of  model 
order.  Standard  validation  and  identification  options  are  activated  by  a  pulldown  menu  with  on-line 
help,  and  all  displayed  parameters  are  open  to  be  changed  for  quick  recomputation  of  the  results. 

As  an  example,  consider  the  window  displayed  in  Figure  3.2  which  was  written  in  XMATH/GUI 
and  which  is  intended  for  interactive  system  identification.  This  tool  allows  the  user  to  identify  all 
ARX  models  up  to  a  certain  order,  view  their  frequency  response  and  confidence  intervals,  and  vary 
the  data  window  (gray  area  in  the  data  plot  area)  and  model  order  (gray  bar  in  the  two  upper  right 
error  norm  plots)  using  the  mouse  only.  In  the  lower  left  area,  all  important  model  parameters  are 
displayed  and  various  options  can  be  accessed  by  activating  a  pulldown  menu  from  the  top  menu 
bar. 

Clearly the XMATH/GUI can be used for the development of an interactive object oriented
environment which is sensible for a wide variety of users in the field of system identification and
control system design. This not only relieves the user of the burden of command files but also
makes the design procedure completely self-explanatory.

On the Character of Uncertainty for System
Identification  and  Robust  Control  Design  * 


Robert L. Kosut†
September  15,  1992 


“It  ain’t  the  things  you  don’t  know  what  gets  you  in  deep  trouble.  It’s  the  things 
you  know  for  sure,  but  what  ain’t  so.”  -Uncle  Remus. 

Nothing  could  more  aptly  describe  the  predicament  when  faced  with  the  problem  of 
designing  a  controller  from  accumulated  sensed  input-output  data.  The  identification,  or 
estimation,  of  a  system’s  transfer  function  from  input-output  data  has  a  long  history  and 
there  are  many  excellent  survey  articles  and  textbooks  that  can  be  referenced,  e.g.,  [4],  [8], 
[7], [15],  [14].  The  problem  with  all  the  methods  discussed  in  these  references,  insofar  as  robust 
control  design  is  concerned,  is  that  model  error  estimates  are  usually  not  available,  and  if 
available,  cannot  be  trusted.  The  principal  reason  for  this  difficulty  is  that  the  identification 
algorithms  are  developed  under  the  false  assumption  that  the  true  system  is  in  the  model 
set.  As  a  result,  the  model  estimate,  loosely  speaking,  is  “biased”,  and  hence,  a  controller 
designed  using  the  estimate  may  result  in  unacceptable  closed-loop  behavior,  a  phenomenon 
which  is  well  documented,  e.g.,  [14,  3,  1].  To  paraphrase  the  above  aphorism,  “Trouble  is 
bound  to  follow  if  the  identified  model  is  known  for  sure  to  be  the  true  system.” 

This  situation  is  unfortunate,  because  all  the  standing  assumptions  made  in  current 
robust  control  design  methods  require  a  model  set  description  which  typically  consists  of 
a  nominal  model  and  an  error  estimate,  usually  a  norm  bound,  where  both  together  are 
guaranteed  to  encompass  the  true  system.  To  fulfill  the  needs  of  robust  control  design  will 
therefore  require  a  new  approach  to  system  identification  which  provides  both  a  nominal 
model  and  a  measure  of  its  uncertainty.  Such  schemes  have  been  referred  to  by  various 
names,  e.g.,  set-membership  identification,  set-estimation,  uncertainty  modeling,  as  well  as 
other  self-canceling  phrases  --  how  does  one  model  an  uncertainty?  This  research  topic  has 
received  strong  interest  recently  as  evidenced  by  this  workshop,  the  recent  special  issue  [10], 
and  the  many  conference  sessions  planned  at  the  next  ACC  and  CDC. 

*An essay for the NSF/AFOSR sponsored Workshop on “The Modeling of Uncertainty in Control Systems,” University of California, Santa Barbara, June 18-20, 1992.

† Integrated Systems, Inc., 3260 Jay St., Santa Clara, CA, 95054 and Department of Electrical Engineering,
Stanford University, Stanford, CA.

Formulating the Problem


“If what is said is not meant, then what ought to be done, remains undone.”
- Confucius.


Sometimes  solving  a  problem  means  finding  a  simple  or  direct  statement  of  the  problem 
in  the  first  place.  In  attempting  to  distill  the  problem  formulation  to  its  essence,  perhaps  it  is 
this:  given  a  finite  collection  of  sensed  sampled  input/output  data  from  an  unknown  system, 
what  level  of  confidence  can  be  assigned  to  a  feedback  controller  design  or  modification?  If, 
other  than  the  measured  data,  there  is  no  additional  knowledge  about  the  system,  then  the 
problem  is  solved:  there  is  no  safe  controller.  Anything  can  happen,  because  there  is  no 
means  for  inferring  the  future  from  the  past.  Therefore,  to  make  the  problem  meaningful, 
it  is  necessary  to  make  a  priori  assumptions  about  the  system.  These  assumptions  can 
be  either  qualitative  or  quantitative.  For  example,  assuming  that  the  unknown  system  is 
linear-time-invariant  is  qualitative  a  priori  knowledge.  Knowing  that  it  is  stable  can  still 
be classified as qualitative, but assigning a region for pole locations or knowing a bound on
the impulse response is quantitative. A similar classification can be made regarding signal
characteristics. Knowing that a signal is white is qualitative; but knowing a precise value for
the  variance  is  quantitative. 

Although a priori quantitative information may be readily available, e.g., from the underlying physics, I think that it is first necessary to resolve the more pristine problem of
specifying  a  minimal  amount  of  qualitative  a  priori  data  so  as  to  assign  a  high  degree  of 
confidence  to  a  controller  design. 


Is  Nature  Good,  Evil,  or  Indifferent? 

The  phrase  “high  degree  of  confidence”  needs  clarification.  Do  we  mean  worst-case  or 
high  probability? 

Current  robust  control  formats  are  based  on  worst-case  scenarios.  Nature  is  perceived  as 
Evil,  and  hence,  does  the  wrong  thing,  from  our  perspective.  However,  if  this  is  not  the  case, 
and  Nature  is  at  worst  Indifferent  or  Neutral,  then  the  problem  should  be  posed  in  reverse:  to 
fulfill  the  needs  of  system  identification,  long  resting  on  a  probabilistic  (neutral)  foundation, 
may  require  a  new  approach  to  robust  control  which  allows  for  a  probabilistic  description 
of  uncertainty!  This  latter  possibility  invokes  the  current  debate  on  the  intrinsic  nature  or 
character  of  the  uncertainty  set.  Is  it  probabilistic  or  worst-case  deterministic?  Clearly  both 
can be used to quantify uncertainty in both disturbances and transfer functions. However,
searching for the worst-case may be a hopeless task. If the worst-case has not yet occurred,
it might in the future, and hence, the search never ends. Fitting a probabilistic model is
more sensible in this regard, but a 99.99% confidence level does not preclude the remaining
0.01% from occurring.

A probabilistic, or stochastic, description of a disturbance is common practice and forms
the basis for $\mathcal{H}_2$-filtering and control design, i.e., optimal filtering and LQG control design
[2]. A power bounded set of disturbances and/or a worst-case deterministic description of
transfer function uncertainty leads to $\mathcal{H}_\infty$ methods of control design, e.g., [5]. These sets
can be combined leading to mixed $\mathcal{H}_2/\mathcal{H}_\infty$ control design, e.g., [9]. The above examples by
no  means  exhaust  the  possible  deterministic  and  probabilistic  sets.  For  example,  sequences 
can  be  uncertain  but  have  a  bounded  spectrum  or  a  bounded  magnitude.  Transfer  functions 
can  be  uncertain  but  with  (time)  bounded  impulse  responses,  and  so  on.  The  choice  of 
which  uncertainty  characterization  to  use  depends  upon  prior  knowledge  about  the  true 
system.  Clearly  different  assumptions  ought  to  lead  to  set  estimators  with  differing  forms 
and  mixtures  of  probabilistic  and/or  worst-case  deterministic  uncertainty  types. 

As  a  case  in  point,  if  we  begin  with  a  stochastic  description  of  the  exogenous  inputs 
to  a  system,  then  the  standard  least-squares  based  identification  method  with  a  high-order 
ARX  model  structure  leads  naturally  to  a  purely  parametric  uncertainty  which,  depending 
on further assumptions, is either probabilistic (normally distributed) or worst-case deterministic (ellipsoid bounded), e.g., [12, 11, 6]. To conform to current robust control paradigms,
the  parametric  characterization  of  uncertainty  must  be  transformed  to  a  non-parametric 
worst-case  deterministic  frequency  domain  bound,  a  transformation  that  is  not  without  a 
considerable  loss  of  information.  Dealing  directly  with  the  worst-case  deterministic  (ellipsoid 
bounded)  parameter  uncertainty  leads  to  some  new  insights  into  robust  control  design  e.g., 
[13].  For  the  probabilistic  form  of  parameter  uncertainty,  it  is  my  view  that  it  would  be 
better  to  develop  a  compatible  theory  of  “probabilistic”  robust  control. 

Going in this direction, however, immediately raises the question: what does a robust
control mean in the context of probabilities? We tend to think of a robust controller as providing an absolute guarantee against instability and/or certain levels of performance degradation given a deterministic, or “hard,” bound on plant uncertainty. With a probabilistic
description, or “soft” bound, we must decide if 99.99% is safe enough. To turn the question the other way, the deterministic bound necessitates guarding against the worst-case,
which may be extreme, i.e., unlikely, thereby leading to a conservative controller. But this
brings us back to exactly the question of probabilities and outcomes, and finally to a more
fundamental question: is Nature neutral or conspiratorial?


Towards  a  New  Paradigm,  or  Paradigm  Lost 


Attempting an answer may not be necessary, nor very fruitful. I think that a better attitude at this point is to follow the consequences, without prejudice, of developing a theory of
set-membership identification and corresponding “robust” control design methods compatible with probabilistic plant set descriptions. This to me seems the more sensible engineering
oriented character of uncertainty.

Hopefully,  as  a  result  of  research  efforts  in  many  different  directions,  new  paradigms 
will  arise  which  combine  system  identification  and  robust  control  design.  With  the  wide 
availability  and  use  of  CACSD  packages,  the  benefits  of  this  research  could  be  widely  utilized 
in  many  engineering  fields.  Hence,  it  becomes  imperative  that  the  resulting  methodologies  are 
comprehensible and useful for the engineering community at large; not just understandable
to a few experts. The onus is on us!

Abstract — A method is presented for parameter set estimation
where  the  system  model  is  assumed  to  contain  both  parametric 
and  nonparametric  uncertainty.  In  the  disturbance-free  case,  the 
parameter  set  estimate  is  guaranteed  to  contain  the  parameter 
set  of  the  true  plant.  In  the  presence  of  stochastic  disturbances, 
the  parameter  set  estimate  obtained  from  finite  data  records  is 
shown to have the property that it contains the true-plant
parameter  set  with  probability  one  as  the  data  length  tends  to 
infinity. 

I.  Introduction 

In the traditional adaptive control system, the identified
model  is  used  for  on-line  controller  design  without  any 
regard  for  errors  between  this  model  and  the  true  system 
which  generated  the  data.  The  identified  model  is  usually 
selected  out  of  a  model  set  with  unknown  parameters  as 
depicted  in  Fig.  1.  The  controller  is  designed  as  if  the 
parameter  estimates  were  in  fact  the  correct  parameters  for 
describing  the  plant.  This  is  known  as  applying  the  certainty 
equivalence  principle.  In  the  ideal  case,  it  is  assumed  that 
there  exist  parameters,  which  if  known,  would  precisely 
account  for  the  measured  data.  Even  in  this  ideal  case,  the 
transient  errors  between  the  identified  model  and  the  true 
system can be so large as to completely disrupt the performance. In the usual (nonideal) case, the true system is not in
the model set; therefore, unacceptable transient or
asymptotic behavior can occur, e.g., [1].

Following the ancient Greek adage,¹ “Well begun, half
done,” one ought to construct, at the outset, an adaptive
control system which specifically accounts for the inevitable
model error, i.e., an adaptive robust control. Depicted in
Fig. 2 is our proposed scheme where the traditional parameter estimator is replaced with an estimator that produces a
model set. Thus, point estimation of a single model is
replaced with set-membership identification. The estimated
model set can contain both parametric and nonparametric
descriptions of uncertainty arrived at from both measured and
prior data.

Fig. 1. Traditional adaptive control system with parameter estimator.

Fig. 2. Adaptive control with set estimator.

We also replace the traditional controller design algorithm
with a robust controller design algorithm which accepts the
model set format. By a robust controller we
mean a controller that achieves some specific set of specifications for any plant model in the model set. The robust
controller design thus takes a set of models as input and
produces a controller that is guaranteed to meet the specifications for all models in this set. The robust controller design
can also report the worst-case performance with respect to
the model set. It is also true that if the model set is too large,
or the specifications are too tight, then no robust controller
will exist.

During the transient or learning phase, the estimated model set could be a poor representation of the true system as it could be quite large. However, if the system which generated the measured data is contained in the estimated set, the robust controller will be stabilizing, though it may be of low authority. Conversely, if the model set becomes smaller after some time, this will be reflected in a higher authority controller with more desirable performance characteristics.

It  is  important  to  point  out,  and  even  emphasize,  that 
although  this  approach  is  inspired  by  a  separation  principle, 
it  is  not  optimal.  Roughly  speaking,  set  estimation  and 
robust  controller  design  might  benefit  from  being  coupled. 
For example, the input u might be temporarily manipulated in such a way that the set estimator could rapidly learn and therefore improve future performance at the expense of current performance. In a purely Bayesian framework, notions of optimality along this line are made precise in [9].

Although  not  guaranteed  to  be  optimal,  the  scheme  shown 
in  Fig.  2  is  at  least  less  heuristic  than  the  traditional  scheme 
of Fig. 1. For example, if the set estimator is consistent, that is, the true plant is in the estimated model set, and moreover, if we stop adapting at any given point, then we are guaranteed a worst-case performance as reported by the robust controller design.

In  this  paper,  we  address  the  problem  of  parameter  set 
estimation  where  the  system  model  contains  both  parametric 
and nonparametric uncertainty. In our formulation, we use the measured data to delineate a parametric set which accounts for a priori knowledge of nonparametric dynamics and disturbances. Observe that if measured data is not used, then the identified model set consists of a constant model set and the “adaptive” controller reduces to a single robust design. We can also recover the traditional adaptive scheme by replacing the robust design with a heuristic design which uses a typical model in the set, e.g., the “center” or “average” model.

We will not address the robust control design issues, as different methodologies for robust control design, particularly for plants with uncertain nonparametric linear dynamics, can be found in [26], [8], and [12]. Methods for robust control design of plants with parametric uncertainty are described in [2], [5] and the references therein. In the case of parametric set-membership uncertainty, minimax controllers are considered in [22] and [21].

At present, there are several competing and complementary methodologies for the design of set estimators, e.g., [29], [20], [17], [14], [18], and [32]. Related work on the limitations of identification of linear time-invariant systems can be found in [13], [15], [24], and [28]. Our work here closely follows that described in [31], [32], and [18] for the disturbance-free case with nonparametric uncertainty, and in [23] for the disturbance case. The parameter sets developed here are similar in form to those developed in [10], [11], [25], and [3] for the case with no nonparametric uncertainty but with bounded disturbances. The foundation and impetus for much of the work in parameter set-membership identification can be traced back to [27] and [4] for the state-estimation problem.

The paper is organized as follows. After introducing some notation and standard definitions in the next section, the problem is formulated in Section III. Parameter set estimates for the disturbance-free equation-error case are developed in Section IV. In the presence of stochastic disturbances, equation-error parameter set estimates, computable from finite data records, are presented in Section V. Extensions to the output error case and deterministic disturbances are discussed in Section VI. The paper concludes with some remarks in Section VII.


II. Notation and Preliminaries

Transfer Functions: In this paper, we consider sampled-data systems with transfer functions in the complex variable $z$. If the system is denoted by $G$, then its transfer function is denoted by $G(z)$. Typically, $G(z)$ is obtained as the zero-order hold equivalent of a continuous-time transfer function $P(s)$. Thus,

$$G(z) = \mathcal{ZOH}\{P(s)\} \tag{1}$$

$$\phantom{G(z)} = (1 - z^{-1})\,\mathcal{Z}\left\{\tfrac{1}{s}P(s)\right\} \tag{2}$$

where $\mathcal{ZOH}\{\cdot\}$ and $\mathcal{Z}\{\cdot\}$ denote the zero-order hold and the usual z-transform operations, respectively.
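The zero-order hold relation (1)-(2) can be checked numerically on a simple case. The sketch below is an illustration only, not taken from the paper: it assumes a hypothetical first-order plant $P(s) = a/(s+a)$ and a sampling interval `T`, for which the zero-order hold equivalent has the closed form $G(z) = (1 - e^{-aT})z^{-1}/(1 - e^{-aT}z^{-1})$. The function names are made up for the example, and the sample index here starts at 0 for convenience.

```python
import numpy as np

def zoh_first_order(a, T):
    """ZOH equivalent of the assumed plant P(s) = a/(s+a), sampling interval T.
    Returns (b1, a1) such that G(z) = b1 z^-1 / (1 + a1 z^-1)."""
    p = np.exp(-a * T)          # discrete-time pole
    return 1.0 - p, -p

def step_response(b1, a1, n_samples):
    """Unit-step response of y(t) = -a1*y(t-1) + b1*u(t-1), zero initial state."""
    y = np.zeros(n_samples)
    for t in range(1, n_samples):
        y[t] = -a1 * y[t - 1] + b1 * 1.0
    return y

# The ZOH equivalent is step-invariant: the sampled continuous-time step
# response 1 - exp(-a t T) must match the discrete-time step response.
a, T = 1.0, 0.1
b1, a1 = zoh_first_order(a, T)
y_discrete = step_response(b1, a1, 20)
y_sampled = 1.0 - np.exp(-a * T * np.arange(20))
```

Step invariance is exactly what (2) expresses: $\mathcal{Z}\{P(s)/s\}$ is the transform of the sampled step response, and the factor $(1 - z^{-1})$ removes the discrete-time step.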

A transfer function $G(z)$ is stable if all its poles are strictly inside the unit circle $|z| = 1$. The frequency response of $G(z)$ is the function $G(e^{j\omega})$ restricted to the domain $|\omega| \le \pi$, where $\omega$ is the frequency variable normalized with respect to the sampling frequency. For a stable transfer function $G(z)$, the $\mathcal{H}_\infty$ and $\mathcal{H}_2$ norms are defined as

$$\|G\|_{\mathcal{H}_\infty} = \sup_{|\omega| \le \pi} |G(e^{j\omega})| \tag{3}$$

$$\|G\|_{\mathcal{H}_2} = \left(\frac{1}{2\pi}\int_{-\pi}^{\pi} |G(e^{j\omega})|^2\, d\omega\right)^{1/2}. \tag{4}$$

Sequences: A sequence $x$ is a function of discrete-time points, i.e., $x: \mathbb{Z}_+ \to \mathbb{R}^p$, where $\mathbb{Z}_+ = \{1, 2, \ldots\}$ is the set of positive integers. We write $x(t)$ to mean the value of the sequence at a particular time $t$, normalized with respect to the sampling interval. Hence, time takes on integer values with initial time defined as $t = 1$.

Following [24], a sequence $x$ is quasi-stationary if $\mathcal{E}(x(t))$ is bounded for all $t$ and its autocorrelation

$$r_{xx}(\tau) = \lim_{N\to\infty} \frac{1}{N}\sum_{t=1}^{N} \mathcal{E}\big(x(t)x(t-\tau)\big) \tag{5}$$

exists for all integers $\tau$, where $\mathcal{E}(\cdot)$ denotes the expectation operator. If $x$ is a deterministic sequence, the expectation is without effect and quasi-stationary then means that $x$ is a bounded sequence such that the limits

$$r_{xx}(\tau) = \lim_{N\to\infty} \frac{1}{N}\sum_{t=1}^{N} x(t)x(t-\tau) \tag{6}$$

exist. For ease of notation, we introduce the symbol $\bar{\mathcal{E}}$ by

$$\bar{\mathcal{E}}(x) = \lim_{N\to\infty} \frac{1}{N}\sum_{t=1}^{N} \mathcal{E}(x(t)). \tag{7}$$

The power spectrum of $x$ is defined as

$$S_{xx}(\omega) = \sum_{\tau=-\infty}^{\infty} r_{xx}(\tau)\, e^{-j\omega\tau}. \tag{8}$$

This leads to the power in $x$ given by

$$r_{xx}(0) = \frac{1}{2\pi}\int_{-\pi}^{\pi} S_{xx}(\omega)\, d\omega = \lim_{N\to\infty} \frac{1}{N}\sum_{t=1}^{N} \mathcal{E}\big(x(t)^2\big). \tag{9}$$

Similar definitions apply to the cross spectrum $S_{xy}(\omega)$ of the sequences $x$ and $y$.

The sample-mean operator $\mathcal{E}_k(\cdot)$ is defined to be

$$\mathcal{E}_k(x) = \frac{1}{k}\sum_{t=1}^{k} x(t). \tag{10}$$

We use $\|x\|_{k2}$ to denote the truncated $\ell_2$-norm of a sequence

$$\|x\|_{k2} = \left(\sum_{t=1}^{k} x(t)^2\right)^{1/2} \tag{11}$$

hence,

$$\mathcal{E}_k(x^2) = \frac{1}{k}\,\|x\|_{k2}^2. \tag{12}$$
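As a quick numerical illustration of (10)-(12), the following sketch computes the sample mean and the truncated $\ell_2$-norm of a scalar sequence. The function names and the test sequence are illustrative choices, not part of the paper; array index `i` holds the sample at time `t = i + 1`.

```python
import numpy as np

def sample_mean(x, k):
    """E_k(x): average of the first k samples x(1), ..., x(k)."""
    x = np.asarray(x, dtype=float)
    return x[:k].sum() / k

def truncated_l2_norm(x, k):
    """||x||_{k2}: Euclidean norm of the first k samples."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(np.sum(x[:k] ** 2))

x = np.array([3.0, 4.0, 12.0])
k = 2
lhs = sample_mean(x ** 2, k)               # E_k(x^2)
rhs = truncated_l2_norm(x, k) ** 2 / k     # (1/k) ||x||_{k2}^2
```

Both sides evaluate the same quantity, which is the identity (12) used later to pass from norm inequalities to quadratic forms.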

Linear Operators: The notation $Gx$ means the sequence obtained when the system $G$ operates on the sequence $x$. We write $(Gx)(t)$ to mean the value at time $t$ of the sequence $Gx$.

When we say that $G$ is a linear time-invariant system, we mean that $Gx$ is the convolution operation

$$(Gx)(t) = \sum_{k=0}^{t-1} g(k)\,x(t-k) \tag{13}$$

where the sequence $g$ is the pulse response of $G$. Thus, $G$ has the transfer function

$$G(z) = \sum_{k=0}^{\infty} g(k)\,z^{-k}. \tag{14}$$

The above definition restricts the sequence $Gx$ to $t \ge 1$. Hence, the system $G$ can be regarded as having no memory of events prior to $t = 1$, the initial time. Roughly, this means all initial conditions are zero.

To reduce notation, we use the transform variable $z$ to denote the shift operator, so $z^{k}x(t) = x(t+k)$, $z^{-k}x(t) = x(t-k)$, and $z^{k}x$ shifts each member of the sequence $x$.
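A minimal sketch of the operator notation, assuming a finite pulse response stored in an array (`g[k]` holds $g(k)$) and a data record stored so that `x[i]` holds $x(i+1)$. The helper names are hypothetical. Truncating the full convolution to the record length implements the zero-initial-condition convention: samples before $t = 1$ are treated as zero.

```python
import numpy as np

def apply_lti(g, x):
    """(Gx)(t) = sum_{k=0}^{t-1} g(k) x(t-k) for t = 1, ..., N."""
    return np.convolve(g, x)[:len(x)]

def shift_back(x, k):
    """z^{-k} x: delay the sequence by k samples, zero fill before t = 1."""
    x = np.asarray(x, dtype=float)
    return np.concatenate([np.zeros(k), x])[:len(x)]

x = np.array([1.0, 2.0, 3.0])
g = np.array([0.0, 1.0, 0.5])      # G(z) = z^-1 + 0.5 z^-2
y = apply_lti(g, x)
```

With this convention the pure delay $G(z) = z^{-1}$ applied through `apply_lti` agrees with `shift_back(x, 1)`.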

III.  Problem  Formulation 
The problem is to use the measured sampled data

$$\{y(t),\, u(t):\ t = 1, \ldots, N\} \tag{15}$$

to identify a model set suitable for robust control design. The system which produced the data is assumed to be a linear time-invariant system of the form

$$y = Gu + v \tag{16}$$

where $G$ is a linear time-invariant system with transfer function $G(z)$, $u$ is an applied input, $y$ is the measured output, and $v$ is a disturbance as seen at the output. It is also assumed that both $y$ and $u$ have finite power, that is, $r_{yy}(0) < \infty$ and $r_{uu}(0) < \infty$.2

A. Model Set Assumptions

The model set $\mathcal{M}$ is defined as follows:

$$\mathcal{M} = \{y = Gu + v:\ G \in \mathcal{G},\ v \in \mathcal{V}\} \tag{17}$$

where $\mathcal{G}$ is the set of linear time-invariant systems and $\mathcal{V}$ is the set of disturbances. It is assumed that the true system (16) is a member of the model set $\mathcal{M}$. The reader should be cautioned that $G$ defined in the model set $\mathcal{M}$ is not the same as $G$ in (16). To avoid adding more subscripts, e.g., $G_{\text{true}}$, unless otherwise stated as part of some set, e.g., $G \in \mathcal{G}$, the symbols $G$, $y$, $u$, and $v$ refer to the true system (16).

We first concentrate on the disturbance-free case, i.e., $v = 0$, in the next section. The disturbance set $\mathcal{V}$ is discussed later in Section V.

The set of linear time-invariant systems is defined by

$$\mathcal{G} = \{G_\theta(1 + \Delta_G W_G):\ \theta \in \Theta_{\text{prior}},\ \|\Delta_G\|_{\mathcal{H}_\infty} \le 1\} \tag{18}$$

where $G_\theta(z)$ is a parametric transfer function with parameters $\theta \in \Theta_{\text{prior}}$, referred to as the prior parameter set. The system $\Delta_G W_G$ is referred to as the multiplicative nonparametric uncertainty. It is a dynamic uncertainty characterized by an uncertain but unity-bounded stable transfer function $\Delta_G(z)$ and a known stable transfer function $W_G(z)$. Note that $W_G(z)$ acts as a frequency weighting function, whose frequency response magnitude $|W_G(e^{j\omega})|$ reflects the size of the nonparametric uncertainty. Since a parametric model of a system is never complete unless we have some idea of its limitations and accuracy, we assume that the uncertainty weighting function $W_G(z)$ is known. Having knowledge of $W_G(z)$ is precisely the assumption made in robust control design, e.g., [8]. However, whereas the center of the model set is fixed in robust control design, here it is parametric, i.e., $G_\theta$.

Suppose the true system $G$ is in $\mathcal{G}$ and we are interested in all the possible representations of $G$ in $\mathcal{G}$. Solving for $\Delta_G$ in (18) in terms of $G$ and $\theta$, we get

$$\Delta_G = \frac{G - G_\theta}{W_G G_\theta}. \tag{19}$$

We define

$$\Theta^* = \left\{\theta:\ \left\|\frac{G - G_\theta}{W_G G_\theta}\right\|_{\mathcal{H}_\infty} \le 1\right\} \tag{20}$$

and refer to $\Theta^*$ as the parametric limit set because it does not depend on the data set but rather on the true but unknown system $G$. As a result, $\Theta^* \cap \Theta_{\text{prior}}$ is the set of all possible parameter values consistent with the assumption that the true system $G$ is in $\mathcal{G}$. Consequently, it is not possible to consider a “true” parameter value because any member of $\Theta^* \cap \Theta_{\text{prior}}$ is a possibility, since the decomposition of $G$ into $G_\theta$ and $\Delta_G$ is not unique. Thus, the goal is to obtain an estimate of the set $\Theta^*$ from the measured data.

2 Input and output sequences with finite power occur, for example, when $G$ is stable and $u$ has finite power, or when $G$, not necessarily stable, is stabilized by an appropriate feedback and the exogenous inputs to the feedback system have finite power.

Throughout the remainder of the paper we further characterize the parametric transfer function $G_\theta(z)$ by using the standard ARX form [24]

$$G_\theta(z) = B_\theta(z)/A_\theta(z)$$

$$B_\theta(z) = b_1 z^{-1} + \cdots + b_m z^{-m}$$

$$A_\theta(z) = 1 + a_1 z^{-1} + \cdots + a_n z^{-n}$$

$$\theta = [a_1\ \cdots\ a_n\ \ b_1\ \cdots\ b_m]^T. \tag{21}$$

Thus, the parameters are the coefficients in the parametric transfer function. With this parametrization, the limit set becomes

$$\Theta^* = \left\{\theta:\ \left\|\frac{A_\theta G - B_\theta}{W_G B_\theta}\right\|_{\mathcal{H}_\infty} \le 1\right\}. \tag{22}$$

The problem we are addressing in this paper is to find an estimate of $\Theta^*$. We should also point out that, other than what is assumed for the transfer function $\Delta_G(z)$, we do not estimate it from the data. We first give an example of $\Theta^*$, and then in the next section, describe a set estimator in the disturbance-free case.


B. Example of Limit Set

Suppose that the true transfer function is

(/  10  \
(t+t)

The sampling frequency is chosen to be $2\pi(10)$ rad/s or 10 Hz. Observe that the system has a simple pole at 1 rad/s, and a very lightly damped resonance at 10 rad/s. Suppose we are interested in obtaining a good low-frequency model by neglecting the resonance, but accounting for it as one realization of some nonparametric dynamics. Thus, select the parametric transfer function as

(24)

Consider the following weights:

W'c.t(z)  =65  (25)

K'c.zU)  =  ^c.i(z)  -65(^)  •  (26)

Either of these weights can account for the resonance, but they reflect different prior low-frequency uncertainties. The weight $W_{G,1}$ reflects a low-frequency multiplicative uncertainty of about 10%, where it has a dc gain of about 0.1, and it anticipates a rather large resonance at frequencies beyond about 10 rad/s, where the magnitude of $W_{G,1}$ is greater than 100. $W_{G,2}$ is essentially the same but has a zero dc gain. Shown in Fig. 3 are the frequency response magnitudes and the multiplicative error with respect to a “nominal” parametric transfer function

I  10
-+-j  (27)

This transfer function can be viewed as an approximation of $G(z)$ obtained by neglecting the resonance in (23). Remember, there is no true parameter value; rather, there is a true set $\Theta^*$, one element of which is this nominal parameter value.

Points in the limit set corresponding to the above weights are shown in Fig. 4. These points are obtained by testing $\theta$ in (19) over a set of points. If a point's corresponding $\Delta_G$ satisfies $\|\Delta_G\|_{\mathcal{H}_\infty} \le 1$, then it belongs to $\Theta^*$. Since $W_{G,2}(e^{j\omega})$ is zero at $\omega = 0$, i.e., the dc gain of $G(z)$ is assumed known, the two parameters in $\theta$ are constrained to lie on a line in the parameter space. The line becomes “blurred” in the limit set corresponding to $W_{G,1}$ because there is no frequency where the frequency response of $W_{G,1}$ is identically zero.


IV. Disturbance-Free Equation-Error Set Estimation

In the disturbance-free case, we have $v = 0$. Thus, the model set in (17) reduces to

$$\{y = Gu:\ G \in \mathcal{G}\} \tag{29}$$

with $\mathcal{G}$ given by (18).

Theorem 1: Suppose the measured data $\{y, u:\ t = 1, \ldots, N\}$ is generated from $y = Gu$ with $G \in \mathcal{G}$. Then the following holds:

$$\Theta^* \subseteq \Theta[N] \subseteq \Theta_k,\quad \forall k \in [1, N],\ \forall N \in \mathbb{Z}_+ \tag{30}$$

where $\Theta[N]$ and $\Theta_k$ are given by

$$\Theta_k = \{\theta:\ \|A_\theta y - B_\theta u\|_{k2} \le \|W_G B_\theta u\|_{k2}\} \tag{31}$$

$$\Theta[N] = \bigcap_{k=1}^{N} \Theta_k. \tag{32}$$

Remarks: We refer to $\Theta_k$ or $\Theta[N]$ as equation-error parameter sets because the equation-error term $A_\theta y - B_\theta u$ appears in the definition [24]. Observe that the equation-error sets depend only on the measured data and the known bounding transfer function $W_G(z)$. Because $\Theta^*$ is a subset, it follows that $\Theta_k$ for any $k \in [1, N]$, or $\Theta[N]$, is an estimate of $\Theta^*$. These sets are easy to compute, as will be shown in Section IV-C. First we prove the theorem.
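The membership conditions (31) and (32) can be evaluated directly from a data record. The sketch below is an illustration under stated assumptions, not the authors' implementation: the weight $W_G$ is taken to be FIR (given by its pulse response `w_g`), the data are generated by a hypothetical first-order system $G = 1.1\,G_{\theta_0}$ with $\theta_0 = (a_1, b_1) = (-0.5, 1)$, which corresponds to $\Delta_G W_G = 0.1$ and is admissible for the constant weight $W_G = 0.2$. All function names are made up for the example.

```python
import numpy as np

def lti(g, x):
    """Apply an FIR pulse response g to x with zero initial conditions."""
    return np.convolve(g, x)[:len(x)]

def in_theta_k(a, b, y, u, w_g, k):
    """Test (31): ||A y - B u||_{k2} <= ||W_G B u||_{k2}."""
    A = np.concatenate([[1.0], a])      # 1 + a_1 z^-1 + ... + a_n z^-n
    B = np.concatenate([[0.0], b])      # b_1 z^-1 + ... + b_m z^-m
    equation_error = lti(A, y) - lti(B, u)
    bound = lti(w_g, lti(B, u))
    return np.linalg.norm(equation_error[:k]) <= np.linalg.norm(bound[:k])

def in_theta_N(a, b, y, u, w_g, N):
    """Test (32): theta lies in every Theta_k, k = 1, ..., N."""
    return all(in_theta_k(a, b, y, u, w_g, k) for k in range(1, N + 1))

# Hypothetical data: y = 1.1 * G_theta0 u with G_theta0 = z^-1 / (1 - 0.5 z^-1).
rng = np.random.default_rng(0)
N = 50
u = rng.standard_normal(N)
y0 = np.zeros(N)
for t in range(1, N):
    y0[t] = 0.5 * y0[t - 1] + u[t - 1]
y = 1.1 * y0

consistent = in_theta_N([-0.5], [1.0], y, u, [0.2], N)     # theta0
inconsistent = in_theta_N([-0.5], [2.0], y, u, [0.2], N)   # wrong gain
```

For $\theta_0$ the equation error equals $0.1\,B_{\theta_0}u$, which is within the bound $0.2\,B_{\theta_0}u$ for every $k$, so $\theta_0$ is retained. Doubling $b_1$ produces an equation error that exceeds the bound as soon as the input is nonzero, so that parameter is rejected.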

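Proof: First, recall the following fact from [7]. If $T$ is a stable linear time-invariant operator with transfer function $T(z)$, then

$$\|T\|_{\mathcal{H}_\infty} = \sup_{|\omega| \le \pi} |T(e^{j\omega})| \tag{33}$$

$$\phantom{\|T\|_{\mathcal{H}_\infty}} = \sup_{\|x\|_{k2} \ne 0,\ k \in \mathbb{Z}_+} \frac{\|Tx\|_{k2}}{\|x\|_{k2}} \tag{34}$$

$$\phantom{\|T\|_{\mathcal{H}_\infty}} = \inf\{\gamma:\ \|Tx\|_{k2} \le \gamma\|x\|_{k2},\ \forall\, \|x\|_{k2} < \infty,\ \forall k \in \mathbb{Z}_+\}. \tag{35}$$

As a direct consequence, we also have

$$\sup_{\|T\|_{\mathcal{H}_\infty} \le \gamma} \|Tx\|_{k2} = \gamma\,\|x\|_{k2}. \tag{36}$$

To show that

$$\Theta^* \subseteq \Theta_k,\quad \forall k \in \mathbb{Z}_+ \tag{37}$$

let $\theta^* \in \Theta^*$, i.e.,

$$G = \frac{B_{\theta^*}}{A_{\theta^*}}\,(1 + \Delta_G^* W_G) \tag{38}$$

with $\|\Delta_G^*\|_{\mathcal{H}_\infty} \le 1$. Note that $\theta^*$ and $\Delta_G^*$ must agree with the measured data, so

$$A_{\theta^*} y - B_{\theta^*} u = \Delta_G^* W_G B_{\theta^*} u. \tag{39}$$

Taking the $\ell_2$-norm, we have

$$\|A_{\theta^*} y - B_{\theta^*} u\|_{k2} = \|\Delta_G^* W_G B_{\theta^*} u\|_{k2}. \tag{40}$$

Since $\|\Delta_G^*\|_{\mathcal{H}_\infty} \le 1$, (36) implies that $\theta^*$ must satisfy

$$\|A_{\theta^*} y - B_{\theta^*} u\|_{k2} \le \|W_G B_{\theta^*} u\|_{k2}. \tag{41}$$

Therefore,

$$\theta^* \in \{\theta:\ \|A_\theta y - B_\theta u\|_{k2} \le \|W_G B_\theta u\|_{k2}\} = \Theta_k \tag{42}$$

or $\Theta^* \subseteq \Theta_k$. From this, it follows immediately that $\Theta^* \subseteq \Theta[N]$. □

Fig. 3. Frequency response magnitudes of $W_{G,1}$, $W_{G,2}$, and $(G - G_\theta)/G_\theta$ versus frequency (rad/sec).

Fig. 4. Limit sets $\Theta^*$ for $W_{G,1}$ and $W_{G,2}$ (horizontal axis: parameter a).

A. Frequency-Domain Expressions

Define the asymptotic equation-error set as

$$\Theta_\infty = \lim_{k \to \infty} \Theta_k. \tag{43}$$

The limit set $\Theta^*$ and the asymptotic equation-error set $\Theta_\infty$ are expressed in the frequency domain in the following theorem.

Theorem 2:

i) The limit set has the following decomposition:

$$\Theta^* = \Theta_{\text{stab}} \cap \left\{\theta:\ |A_\theta G - B_\theta| \le |W_G B_\theta|\ \text{on}\ z = e^{j\omega},\ \forall\, |\omega| \le \pi\right\} \tag{44}$$

where

$$\Theta_{\text{stab}} = \left\{\theta:\ \frac{A_\theta G - B_\theta}{W_G B_\theta}\ \text{stable}\right\} \tag{45}$$

and the magnitude condition defining the second set in (44) reads in full

$$|A_\theta(e^{j\omega})G(e^{j\omega}) - B_\theta(e^{j\omega})| \le |W_G(e^{j\omega})B_\theta(e^{j\omega})|,\quad \forall\, |\omega| \le \pi. \tag{46}$$

ii) If $y = Gu$ and $u$ has spectrum $S_{uu}(\omega)$, then

$$\Theta_\infty = \left\{\theta:\ \int_{-\pi}^{\pi} \left(|A_\theta G - B_\theta|^2 - |W_G B_\theta|^2\right) S_{uu}\, d\omega \le 0\right\}. \tag{47}$$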


Proof: The decomposition of $\Theta^*$ follows directly from the definition of the $\mathcal{H}_\infty$ norm. The asymptotic set description is a direct application of the spectral expressions in (9). □

Theorem 1 states that $\Theta^* \subseteq \Theta_k$ for all $k$. It is clear from the frequency-domain expression for $\Theta^*$ that $\Theta^* \subseteq \Theta_\infty$ also, because $\theta \in \Theta^*$ implies that the integrand in the frequency-domain expression for $\Theta_\infty$ is nonpositive. Note also that the definition of $\Theta^*$ describes a parameter set via an $\mathcal{H}_\infty$ norm. By comparison, $\Theta_\infty$ is described via an $\mathcal{H}_2$ norm when $u$ is white noise with $S_{uu}(\omega) = 1$, i.e.,

$$\Theta_\infty = \left\{\theta:\ \frac{\|A_\theta G - B_\theta\|_{\mathcal{H}_2}}{\|W_G B_\theta\|_{\mathcal{H}_2}} \le 1\right\}. \tag{48}$$

B. Use of Data Filtering

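The effect of data filtering is to replace $(y, u)$ with $(Fy, Fu)$, where $F$ is a filter with transfer function $F(z)$. Hence

$$\Theta_k = \{\theta:\ \|A_\theta F y - B_\theta F u\|_{k2} \le \|B_\theta W_G F u\|_{k2}\}. \tag{49}$$

The effect of the filter is seen more clearly in the frequency-domain expression

$$\Theta_\infty = \left\{\theta:\ \int_{-\pi}^{\pi} \left(|A_\theta G - B_\theta|^2 - |W_G B_\theta|^2\right) |F|^2 S_{uu}\, d\omega \le 0\right\}. \tag{50}$$

The filter and the input spectrum form the frequency-dependent weight $|F(e^{j\omega})|^2 S_{uu}(\omega)$, which also appears in standard equation-error minimization methods [24].


C. Computing the Equation-Error Set

Ideally, it is desirable to compute $\Theta[N]$. This involves intersecting the $N$ sets

$$\{\Theta_k:\ k = 1, \ldots, N\}.$$

We start with the following result, which presents a convenient form for computing $\Theta_k$.

Theorem 3: Define the following vectors whose elements are sequences:

$$\phi = \begin{bmatrix} \phi_y \\ \phi_u \end{bmatrix} \tag{51}$$

$$\phi_y = [-z^{-1}y\ \cdots\ -z^{-n}y]^T \tag{52}$$

$$\phi_u = [z^{-1}u\ \cdots\ z^{-m}u]^T. \tag{53}$$

Then,

i) $\Theta_k$ can be expressed in the quadratic form

$$\Theta_k = \{\theta:\ \theta^T \Gamma_k \theta - 2\beta_k^T \theta + \alpha_k \le 0\} \tag{54}$$

where $\alpha_k \in \mathbb{R}$, $\beta_k \in \mathbb{R}^p$, and $\Gamma_k \in \mathbb{R}^{p \times p}$ (with $p = m + n$) are given by

$$\Gamma_k = \mathcal{E}_k(\phi\phi^T) - \begin{bmatrix} 0 & 0 \\ 0 & \mathcal{E}_k\big((W_G\phi_u)(W_G\phi_u)^T\big) \end{bmatrix} \tag{55}$$

$$\alpha_k = \mathcal{E}_k(y^2) \tag{56}$$

$$\beta_k = \mathcal{E}_k(\phi y). \tag{57}$$

ii) Provided $\Gamma_k^{-1}$ exists, another expression is

$$\Theta_k = \{\theta:\ (\theta - \hat\theta_k)^T \Gamma_k (\theta - \hat\theta_k) \le V_k\} \tag{58}$$

$$\hat\theta_k = \Gamma_k^{-1}\beta_k \tag{59}$$

$$V_k = \beta_k^T \Gamma_k^{-1} \beta_k - \alpha_k. \tag{60}$$

iii) All the eigenvalues of $\Gamma_k$ are real and some of them can be negative. When $\Gamma_k > 0$, $\Theta_k$ is an ellipsoid in $\mathbb{R}^p$. When $\Gamma_k$ is indefinite, $\Theta_k$ is a hyperboloid in $\mathbb{R}^p$.

Remarks: In part ii), the center of the set $\Theta_k$ is identical to the ordinary least-squares estimate when $W_G = 0$. This occurs only when nonparametric dynamics are neglected.

Proof: Using the definitions in the theorem, we have

$$A_\theta y - B_\theta u = y - \theta^T\phi \tag{61}$$

$$W_G B_\theta u = \theta^T \begin{bmatrix} 0 \\ W_G\phi_u \end{bmatrix}. \tag{62}$$

Hence, substituting into (31), we have

$$\Theta_k = \left\{\theta:\ \|y - \theta^T\phi\|_{k2} \le \left\|\theta^T \begin{bmatrix} 0 \\ W_G\phi_u \end{bmatrix}\right\|_{k2}\right\}. \tag{63}$$

Using (12), the quadratic form of $\Theta_k$ follows immediately, which proves part i).

Part ii) is obtained by direct substitution when $\Gamma_k^{-1}$ exists. To prove iii), observe that $\Gamma_k$ can be expressed as follows:

$$\Gamma_k = \begin{bmatrix} \Gamma_{k,11} & \Gamma_{k,12} \\ \Gamma_{k,12}^T & \Gamma_{k,22} \end{bmatrix} \tag{64}$$

where

$$\Gamma_{k,11} = \mathcal{E}_k(\phi_y\phi_y^T) \tag{65}$$

$$\Gamma_{k,12} = \mathcal{E}_k(\phi_y\phi_u^T) \tag{66}$$

$$\Gamma_{k,22} = \mathcal{E}_k\big(\phi_u\phi_u^T - (W_G\phi_u)(W_G\phi_u)^T\big). \tag{67}$$

The 22 matrix subblock can obviously cause $\Gamma_k$ to have negative eigenvalues. The lengths of the semiaxes of the ellipsoid are proportional to the square roots of the eigenvalues of $\Gamma_k^{-1}$ (with scale factor $\sqrt{V_k}$). Therefore, as $\Gamma_k$ becomes singular, some directions of the ellipsoid become unbounded. A hyperboloid results when one or more eigenvalues of $\Gamma_k$ become negative. □

Note that if the spectrum of $u$ is concentrated at those frequencies where $|W_G(e^{j\omega})|$ is large, the 22 matrix subblock can have negative eigenvalues. This tends to make $\Gamma_k$ indefinite, so that $\Theta_k$ becomes a hyperboloid. This will be illustrated in an example in the next section.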
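The quadratic form of Theorem 3 is straightforward to assemble from data. The following sketch is illustrative only and rests on assumptions that are not in the paper: the weight $W_G$ is FIR (pulse response `w_g`), and the data come from a hypothetical first-order system $y = 1.1\,G_{\theta_0}u$ with $\theta_0 = (-0.5, 1)$ driven by white noise. The function names are invented for the example. It builds $\Gamma_k$, $\beta_k$, $\alpha_k$ as in (55)-(57) and classifies the set from the eigenvalues of $\Gamma_k$ as in part iii).

```python
import numpy as np

def lti(g, x):
    """Apply an FIR pulse response g to x with zero initial conditions."""
    return np.convolve(g, x)[:len(x)]

def delay(x, i):
    """z^{-i} x with zero fill before the initial time."""
    return np.concatenate([np.zeros(i), x])[:len(x)]

def quadratic_form(y, u, w_g, n, m, k):
    """Return (Gamma_k, beta_k, alpha_k) for ARX orders n and m."""
    phi_y = np.array([-delay(y, i) for i in range(1, n + 1)])
    phi_u = np.array([delay(u, i) for i in range(1, m + 1)])
    w_phi_u = np.array([lti(w_g, row) for row in phi_u])[:, :k]
    phi = np.vstack([phi_y, phi_u])[:, :k]
    gamma = phi @ phi.T / k                    # E_k(phi phi^T)
    gamma[n:, n:] -= w_phi_u @ w_phi_u.T / k   # subtract the W_G block of (55)
    beta = phi @ y[:k] / k                     # E_k(phi y)
    alpha = y[:k] @ y[:k] / k                  # E_k(y^2)
    return gamma, beta, alpha

def describe_set(gamma, beta, alpha):
    """Classify Theta_k and return (shape, center, V_k) when Gamma_k is invertible."""
    eig = np.linalg.eigvalsh(gamma)            # Gamma_k is symmetric
    center = np.linalg.solve(gamma, beta)      # theta_hat_k of (59)
    v = beta @ center - alpha                  # V_k of (60)
    if eig.min() > 0:
        shape = "ellipsoid"
    elif eig.min() < 0 < eig.max():
        shape = "hyperboloid"
    else:
        shape = "degenerate"
    return shape, center, v

# Hypothetical data record.
rng = np.random.default_rng(1)
N = 200
u = rng.standard_normal(N)
y0 = np.zeros(N)
for t in range(1, N):
    y0[t] = 0.5 * y0[t - 1] + u[t - 1]
y = 1.1 * y0

small_weight = describe_set(*quadratic_form(y, u, [0.2], 1, 1, N))
large_weight = describe_set(*quadratic_form(y, u, [3.0], 1, 1, N))
```

With the small weight the 22 block stays positive and the set is an ellipsoid. With the large weight the 22 block becomes negative, $\Gamma_k$ is indefinite, and the set is unbounded, which mirrors the remark above about input power at frequencies where $|W_G|$ is large.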

D. Example of $\Theta_k$

The true system was selected as in the previous example in Section III-B, using the weight $W_{G,1}$ defined in (25). The input was a log-spaced sinesweep from 0.1 to 31 rad/s over 102.3 s; thus, $N = 1024$ data samples. Two filtered data sets were generated using eighth-order low-pass Butterworth filters: one with a passband of $\omega_f = 2$ rad/s, and the other with $\omega_f = 1$ rad/s.

Fig. 5 shows $\Theta_{1024}$ processed with the two data filters. A hyperboloid is obtained with $\omega_f = 2$ rad/s and an ellipsoid with $\omega_f = 1$ rad/s. (Note that only one branch of the hyperboloid is shown in the figure.) This confirms the earlier point that when $u$ is concentrated at those frequencies where $|W_G(e^{j\omega})|$ is large, $\Theta_k$ can become unbounded. Points in the limit set $\Theta^*$ are shown and, as predicted by the theory, are all contained in the equation-error sets.

E.  Computing  Intersecting  Ellipsoids 

To compute $\Theta[N]$ requires computing the intersection of the sets $\{\Theta_k:\ k = 1, \ldots, N\}$. Since all the $\Theta_k$ are convex, it follows that $\Theta[N]$ is convex. In general, it is not, however, an ellipsoid. To see this, we plotted some of the bounding ellipsoid sets in Fig. 6. Specifically, it shows

$$\{\Theta_k:\ k = 200, 300, \ldots, 1024\}$$

corresponding to the previous example using the data filter with cutoff at 1 rad/s. Observe that the intersection of the sets produces a smaller (convex) set. Several approaches are possible. One approach is to compute the smallest volume ellipsoid that contains the intersection of the ellipsoids. This is discussed in [6] and [3].

Fig. 6. The equation-error sets $\{\Theta_k:\ k = 200, 300, \ldots, 1024\}$ using the filtered data with $\omega_f = 1$ rad/s; $\Theta^*$ is also shown.

F.  Effect  of  Initial  Conditions 

As defined in Section II, the sequence $Gu$ evaluated at time $t \in \mathbb{Z}_+$ is defined by

$$(Gu)(t) = \sum_{\tau=1}^{t-1} g(\tau)\,u(t-\tau). \tag{68}$$

To account for initial conditions, let $\tilde u$ denote a bounded input applied for $t \le 0$. Thus, the system with initial conditions can be expressed as

$$y = Gu + \tilde y \tag{69}$$

with

$$\tilde y(t) = \sum_{\tau=t}^{\infty} g(\tau)\,\tilde u(t-\tau),\quad \forall t \in \mathbb{Z}_+. \tag{70}$$

If $G$ is stable or is in a stabilizing feedback, then $\tilde y(t) \to 0$ exponentially as $t \to \infty$. Thus, the effect of initial conditions dies out exponentially, quickly or slowly depending on the slowest modes in $G$ or the closed-loop system. Hence, for sufficiently large $N$, we have $\Theta_N \approx \Theta_\infty$. More precisely, for each $\theta \in \Theta_\infty$,

$$\lim_{N\to\infty}\ \inf_{\hat\theta \in \Theta_N} \|\theta - \hat\theta\| = 0 \tag{71}$$

where $\|\cdot\|$ is a norm on $\mathbb{R}^p$. In words, the estimator will eventually report possible parameter values that are close to the asymptotic set, and hence asymptotically bound the limit set $\Theta^*$ as the data length $N$ increases.

Another way to account for the effect of initial conditions is to assume bounds on $\tilde u$ and the tail of $g$:

$$\sum_{\tau=1}^{\infty} |g(\tau)| \le \kappa_1 \tag{72}$$

$$|\tilde u(t)| \le \kappa_2,\quad t \le 0. \tag{73}$$

Then $|\tilde y(t)| \le \kappa_1\kappa_2$, and it can be treated as a bounded disturbance in (69); see, e.g., [30].
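The bound $|\tilde y(t)| \le \kappa_1\kappa_2$ and the exponential decay of $\tilde y$ can be illustrated numerically. The sketch below assumes a hypothetical stable pulse response $g(\tau) = 0.5^{\tau}$ for $\tau \ge 1$, so that $\kappa_1 = 1$, and a constant past input of magnitude $\kappa_2 = 2$ truncated to a finite number of past samples. None of these values come from the paper.

```python
import numpy as np

def initial_condition_response(g, u_past, T):
    """Evaluate y~(t) = sum_{tau >= t} g(tau) u~(t - tau) for t = 1, ..., T.

    g      : callable returning the pulse response g(tau)
    u_past : u_past[i] is the past input u~(-i), i = 0, 1, ..., P-1
    """
    P = len(u_past)
    y_tilde = np.zeros(T)
    for t in range(1, T + 1):
        # t - tau = -i  means  tau = t + i
        y_tilde[t - 1] = sum(g(t + i) * u_past[i] for i in range(P))
    return y_tilde

g = lambda tau: 0.5 ** tau        # assumed pulse response, sum over tau >= 1 is 1
kappa1 = 1.0
kappa2 = 2.0
u_past = kappa2 * np.ones(40)     # assumed worst-case constant past input
y_tilde = initial_condition_response(g, u_past, 10)
```

In this example $\tilde y(1)$ nearly attains the bound $\kappa_1\kappa_2 = 2$ and each later sample is half the previous one, reflecting the single mode at 0.5.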

G.  Other  Forms  of  Nonparametric  Uncertainty 

The equation-error sets we have developed so far assume a multiplicative form of nonparametric uncertainty. This is not a necessary restriction, as they could also have been developed for other forms. The requisite modifications are shown below for some other typical forms.

Theorem 4:

i) Multiplicative: If

$$G = \frac{B_\theta}{A_\theta}(1 + \Delta_G W_G),\quad \|\Delta_G\|_{\mathcal{H}_\infty} \le 1 \tag{74}$$

then

$$\Theta_k = \{\theta:\ \|A_\theta y - B_\theta u\|_{k2} \le \|W_G B_\theta u\|_{k2}\}. \tag{75}$$

ii) Additive: If

$$G = \frac{B_\theta}{A_\theta} + \Delta_G W_G,\quad \|\Delta_G\|_{\mathcal{H}_\infty} \le 1 \tag{76}$$

then

$$\Theta_k = \{\theta:\ \|A_\theta y - B_\theta u\|_{k2} \le \|W_G A_\theta u\|_{k2}\}. \tag{77}$$

iii) Inverse Multiplicative: If

$$G = \frac{B_\theta}{A_\theta(1 + \Delta_G W_G)},\quad \|\Delta_G\|_{\mathcal{H}_\infty} \le 1 \tag{78}$$

then

$$\Theta_k = \{\theta:\ \|A_\theta y - B_\theta u\|_{k2} \le \|W_G A_\theta y\|_{k2}\}. \tag{79}$$

iv) Feedback: If

$$G = \frac{B_\theta / A_\theta}{1 + \Delta_G W_G\, B_\theta / A_\theta},\quad \|\Delta_G\|_{\mathcal{H}_\infty} \le 1 \tag{80}$$

then

$$\Theta_k = \{\theta:\ \|A_\theta y - B_\theta u\|_{k2} \le \|W_G B_\theta y\|_{k2}\}. \tag{81}$$

v) Coprime Factored (Coupled): If

$$G = \frac{B_\theta + \Delta_B W_B}{A_\theta + \Delta_A W_A},\quad \left\|[\Delta_A\ \ \Delta_B]\right\|_{\mathcal{H}_\infty} \le 1 \tag{82}$$

then

$$\Theta_k = \left\{\theta:\ \|A_\theta y - B_\theta u\|_{k2} \le \left\|\begin{bmatrix} W_B u \\ W_A y \end{bmatrix}\right\|_{k2}\right\}. \tag{83}$$

vi) Coprime Factored (Uncoupled): If

$$G = \frac{B_\theta + \Delta_B W_B}{A_\theta + \Delta_A W_A},\quad \|\Delta_B\|_{\mathcal{H}_\infty} \le 1,\ \|\Delta_A\|_{\mathcal{H}_\infty} \le 1 \tag{84}$$

then

$$\Theta_k = \{\theta:\ \|A_\theta y - B_\theta u\|_{k2} \le \|W_B u\|_{k2} + \|W_A y\|_{k2}\}. \tag{85}$$

vii) All the above set estimates $\Theta_k$ have the property that

$$\Theta^* \subseteq \Theta[N] \subseteq \Theta_k. \tag{86}$$

Proof: The proof of the property $\Theta^* \subseteq \Theta_k$ for all the cases above is similar to the proof of Theorem 1. We will show it for case vi) only. Let $\theta^* \in \Theta^*$, i.e.,

$$G = \frac{B_{\theta^*} + \Delta_B^* W_B}{A_{\theta^*} + \Delta_A^* W_A} \tag{87}$$

with

$$\|\Delta_B^*\|_{\mathcal{H}_\infty} \le 1\ \text{and}\ \|\Delta_A^*\|_{\mathcal{H}_\infty} \le 1. \tag{88}$$

Since $\theta^*$, $\Delta_A^*$, and $\Delta_B^*$ must agree with the measured data,

$$A_{\theta^*} y - B_{\theta^*} u = \Delta_B^* W_B u - \Delta_A^* W_A y. \tag{89}$$

Now take the $\ell_2$-norm and apply the triangle inequality. With (88), $\theta^*$ must also satisfy

$$\|A_{\theta^*} y - B_{\theta^*} u\|_{k2} \le \|W_B u\|_{k2} + \|W_A y\|_{k2}. \tag{90}$$

Therefore,

$$\theta^* \in \{\theta:\ \|A_\theta y - B_\theta u\|_{k2} \le \|W_B u\|_{k2} + \|W_A y\|_{k2}\} = \Theta_k \tag{91}$$

and $\Theta^* \subseteq \Theta_k$. □

From these forms it is straightforward to generate the corresponding quadratic forms for computing the sets. In those cases where the right-hand side of any of the above inequalities does not depend on the parameter $\theta$, the center of the parametric set is the usual least-squares estimate, e.g., [32].


V. Equation-Error Set Estimation with Disturbances

There are many ways to characterize the disturbance environment, both in terms of the location and the type of disturbance. To simplify the discussion, we assume that the disturbance is located additively at the output, as given by (16)

$$y = Gu + v.$$

The most common type is the stochastic disturbance, which we consider in this section. Deterministic “worst-case” types of disturbances are discussed briefly in Section VI.

A. Stochastic Additive Disturbance

Suppose that the disturbance $v$ is a zero-mean quasi-stationary sequence in the set

$$\mathcal{V} = \{v:\ S_{vv}(\omega) \le \sigma^2 |W_H(e^{j\omega})|^2,\ S_{vu}(\omega) = 0,\ \forall\, |\omega| \le \pi\} \tag{92}$$

where $W_H(z)$ is a stable and stably invertible transfer function. Equivalently, we can think of $v$ as the output of a stable uncertain linear time-invariant system $H$ with a white-noise input $e$. Hence,

$$v = He \tag{93}$$

where $H$ is in the set of linear time-invariant systems $\mathcal{H}$ and $e$ is in the set of stochastic sequences $\mathcal{W}_{\text{stoch}}$ defined as follows:


$$\mathcal{H} = \{\Delta_H W_H:\ \Delta_H\ \text{stable},\ \|\Delta_H\|_{\mathcal{H}_\infty} \le 1\} \tag{94}$$

$$\mathcal{W}_{\text{stoch}} = \{\text{white noise}\ e:\ S_{ee}(\omega) = \sigma^2,\ S_{eu}(\omega) = 0,\ \forall\, |\omega| \le \pi,\ \text{bounded fourth moment}\}. \tag{95}$$

The disturbance set then becomes

$$\mathcal{V} = \{v = He:\ H \in \mathcal{H},\ e \in \mathcal{W}_{\text{stoch}}\}. \tag{96}$$

Assuming that $W_H$ and $\sigma$ are known, the disturbance set defined above is otherwise parameter-free. One can compare this set description to $\mathcal{G}$, which contains the parametric transfer function $G_\theta(z)$. As it is, the disturbance set is perfectly adequate for describing sensor noise. However, in the case of a general disturbance reflected to the output, the set merely serves to provide an upper bound. For small disturbances this is adequate, but the set is potentially conservative otherwise. For a more complete discussion on this matter, see [19].

We  now  have  the  following. 

Theorem  5:  Suppose  that  the  true  plant  which  generated 
{y,  u:  t  -  1,  -,/V}  has  the  structure  described  above 

Then 

i) 

0^  —  0.  w.p.  1  as  N  -*  00.  (97) 

ii) 

0*  <=  (98) 
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where  the  equation-error  sets  are  now  defined  as  follows: 
©n=  {«:  «N(W{A,y  -  B»d)]2) 

<  <?n({W„'IVcB6u)2)  +  oJ(l  +  QtaQa)\  (99) 

e.  =  {*:  «))2) 

<  <^(( Wff 1  wgb6 m)2)  +  o2(l  +  flX)}  (100) 

with 

°A  =  [«.  (101) 

iii)  In  the  frequency  domain 

(102) 

with 

/.(«)  =  ( |  AeG  -  Bt  | 2  -  |  WcBe  1 2)S„» 

+  M#|2(|//|2S„(«)  -  |  W„\2o2).  (103) 

Observe  that  both  the  finite-data  set  0N,  as  well  as  the 
infinite-data  set  0„,  depend  on  the  noise  intensity  a  and  the 
disturbance  weighting  transfer  function  WH>  whose  inverse 
acts  as  a  data  filter.  The  theorem  is  analogous  to  the  many 
prediction-error  based  parameter  estimators  in  the  sense  that 
for  a  sufficiently  long  data  length  N,  the  estimate  is  equal  to 
the  true  value  with  high  probability  [24],  In  our  case,  the 
finite-data  set  will  contain  0*  with  high  probability.  Part 
i)  of  the  theorem  means  that  for  each  0  e  0„,  there  is  a 
0Ne  Qn  close  to  it  as  N  increases.  More  precisely, 

inf  \\0-dN\\->0  w.p.  1  as  N -*  oo  (104) 

where  |j  •  ||  is  a  norm  on  ®! p. 

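The integrand in the frequency-domain expression for $\Theta_\infty$
is always negative provided that for all $|\omega| \le \pi$

$$ \left|G - \frac{B_\theta}{A_\theta}\right|^2 - \left|W_G \frac{B_\theta}{A_\theta}\right|^2 \le \frac{1}{S_{uu}(\omega)} \left(|W_H|^2 \sigma^2 - |H|^2 S_{ee}(\omega)\right). \tag{105} $$

We can now see the usual effects of signal-to-noise ratio. As
the noise power $\sigma^2$ increases, the "volume" in $\Theta_\infty$ will
increase. Conversely, if $S_{uu}(\omega)$ is large at many frequencies,
$\Theta_\infty$ will shrink. In addition, in the frequency ranges where
$|W_H(e^{j\omega})| > |H(e^{j\omega})|$, an indication of poor prior information,
very large input power at these frequencies is required
to keep $\Theta_\infty$ small.

Proof: Under the assumptions, the true system can be
expressed as

$$ y = \frac{B_\theta}{A_\theta}(1 + \Delta_G W_G) u + \Delta_H W_H e \tag{106} $$

for some $\|\Delta_G\|_\infty \le 1$, $\|\Delta_H\|_\infty \le 1$, and $e \in \mathbf{E}_{\mathrm{stoch}}$.
Rearranging terms and filtering by $W_H^{-1}$ gives

$$ W_H^{-1}(A_\theta y - B_\theta u) = \Delta_G W_H^{-1} W_G B_\theta u + \Delta_H A_\theta e. \tag{107} $$

Squaring both sides and taking autocorrelation at $\tau = 0$, we
get

$$ \mathcal{E}([W_H^{-1}(A_\theta y - B_\theta u)]^2) = \mathcal{E}((\Delta_G W_H^{-1} W_G B_\theta u)^2) + \mathcal{E}((\Delta_H A_\theta e)^2) \tag{108} $$

where the cross terms (between $e$ and $u$) are zero because $e$
and $u$ are independent. Now take the supremum of the
right-hand side to obtain the infinite-data parameter set

$$ \Theta_\infty = \left\{\theta : \mathcal{E}([W_H^{-1}(A_\theta y - B_\theta u)]^2) \le \sup \left[\mathcal{E}((\Delta_G W_H^{-1} W_G B_\theta u)^2) + \mathcal{E}((\Delta_H A_\theta e)^2)\right]\right\}. \tag{109} $$

To evaluate the right-hand side above, we now use the
assumptions $\|\Delta_G\|_\infty \le 1$, $\|\Delta_H\|_\infty \le 1$, and $e \in \mathbf{E}_{\mathrm{stoch}}$ to
obtain

$$ \sup_{\|\Delta_G\|_\infty \le 1} \mathcal{E}((\Delta_G W_H^{-1} W_G B_\theta u)^2) = \mathcal{E}((W_H^{-1} W_G B_\theta u)^2) \tag{110} $$

$$ \sup_{\|\Delta_H\|_\infty \le 1} \ \sup_{e} \ \mathcal{E}((\Delta_H A_\theta e)^2) = \sup_{e} \ \mathcal{E}((A_\theta e)^2) \tag{111} $$

$$ = \sigma^2 \|A_\theta\|_2^2 \tag{112} $$

$$ = \sigma^2 \left(1 + \sum_{i=1}^{n} a_i^2\right) \tag{113} $$

$$ = \sigma^2 (1 + \theta_A^T \theta_A). \tag{114} $$

This yields the set $\Theta_\infty$ as defined in the theorem.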

Observe that $\Theta_N$ has precisely the same form as $\Theta_\infty$
except that the operator $\mathcal{E}(\cdot)$ is replaced everywhere with the
sample mean $\mathcal{E}_N(\cdot)$. To show (97), recall from [24, pp.
34-35] that if the stochastic part of $x$ can be described as
filtered white noise, then the spectrum of an observed single
realization of $x$, computed as for a deterministic signal,
coincides, with probability 1, with that of the process, i.e.,

$$ \lim_{N \to \infty} \mathcal{E}_N(x^2) = \lim_{N \to \infty} \frac{1}{N} \sum_{t=1}^{N} x(t)^2 = \mathcal{E}(x^2). \tag{115} $$

The conditions for this convergence are that $x$ is a quasi-stationary
sequence and the white noise has bounded fourth
moment. Note that since $u$ and $y$ are assumed to have finite
power, $W_H^{-1}(A_\theta y - B_\theta u)$ and $W_H^{-1} W_G B_\theta u$ are quasi-stationary.
Thus, the convergence in (97) holds.

To show that $\Theta^* \subseteq \Theta_\infty$, we use the frequency-domain
expressions in iii). Observe that the frequency-domain expression
for $\Theta_\infty$ can be obtained by substituting $y = Gu + He$
in (100) to get

$$ \Theta_\infty = \{\theta : \mathcal{E}(|W_H^{-1}(A_\theta G - B_\theta)u + W_H^{-1} A_\theta H e|^2) \le \mathcal{E}((W_H^{-1} W_G B_\theta u)^2) + \sigma^2 \|A_\theta\|_2^2\}. \tag{116} $$

Now use the fact that $u$ and $e$ are independent to simplify,
then apply Parseval's theorem. In the frequency-domain expression,
the assumption $H \in \mathcal{H}$ means that

$$ |H(e^{j\omega})|^2 S_{ee}(\omega) - |W_H(e^{j\omega})|^2 \sigma^2 \le 0, \quad \forall \omega \tag{117} $$

and $\theta \in \Theta^*$ means that

$$ |A_\theta G - B_\theta|^2 - |W_G B_\theta|^2 \le 0, \quad \forall \omega. \tag{118} $$

Thus, $\theta \in \Theta^*$ guarantees that $f_\theta(\omega)$ is negative for all frequencies,
and hence, $\Theta^* \subseteq \Theta_\infty$. □

1) Example of Bias Estimation: As an illustrative example,
consider estimating a constant in noise

$$ y(t) = b_0 + e(t). \tag{119} $$

In this case, $W_G(z) = 0$ to reflect the absence of nonparametric
uncertainty, and $H(z) = 1$. In addition, $W_H(z) = 1$,
and $H \in \mathcal{H}$ is satisfied. If $e \in \mathbf{E}_{\mathrm{stoch}}$, then the set estimate
for $b_0$ is

$$ \Theta_N = \{b : (b - \hat{b})^2 \le \sigma^2 - \mathcal{E}_N((y - \hat{b})^2)\} \tag{120} $$

where $\hat{b} = \mathcal{E}_N(y)$. For large $N$, the right-hand side behaves
as $\sigma^2 - \sigma_0^2$, where $\sigma_0^2$ is the true noise variance. Note that
the limit set $\Theta^*$, in this case, is the point $b_0$. Since $\hat{b} \to b_0$
as $N \to \infty$, we see that $\Theta^* \subseteq \Theta_\infty$ as stated in the theorem.
Furthermore, as the bounding variance $\sigma$ approaches $\sigma_0$, the
set $\Theta_\infty$ becomes a point. Observe that $\Theta_\infty$ does not shrink to
a point when there is nonparametric uncertainty, i.e., $W_G(z) \ne 0$.
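
The interval in (120) is simple enough to compute directly. The following is a minimal sketch, assuming simulated Gaussian noise and illustrative values for the true constant, the true noise level, and the bounding level; the function name `bias_interval` is hypothetical.

```python
import math
import random


def bias_interval(y, sigma):
    """Interval {b : (b - b_hat)^2 <= sigma^2 - mean((y - b_hat)^2)} from (120).

    Returns (low, high), or None when the right-hand side is negative,
    i.e. when the assumed bound sigma is smaller than the sample spread.
    """
    n = len(y)
    b_hat = sum(y) / n                                # sample mean of y
    spread = sum((v - b_hat) ** 2 for v in y) / n     # sample variance about b_hat
    slack = sigma ** 2 - spread
    if slack < 0.0:
        return None
    half_width = math.sqrt(slack)
    return (b_hat - half_width, b_hat + half_width)


rng = random.Random(0)
b0, sigma0 = 2.0, 0.3                                 # illustrative true values
y = [b0 + rng.gauss(0.0, sigma0) for _ in range(4096)]

interval = bias_interval(y, sigma=0.4)                # bound above the true level
empty = bias_interval(y, sigma=0.1)                   # bound below the true level
```

When the bounding level exceeds the true noise level, the interval contains the true constant and its half-width is roughly the square root of the difference of the two variances. When the bound is too optimistic, the set is empty, which signals that the prior assumption on the noise is inconsistent with the data.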

B.  Computing  the  Equation-Error  Set 

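For computing $\Theta_N$, we have the following result.

Theorem 6: As in Theorem 3, define the vector sequences
$\phi$, $\phi_y$, and $\phi_u$. Then:

i) $\Theta_N$ can be expressed in the quadratic form

$$ \Theta_N = \{\theta : \theta^T \Gamma_N \theta - 2\beta_N^T \theta + \alpha_N \le 0\} \tag{121} $$

where $\alpha_N \in \mathbb{R}$, $\beta_N \in \mathbb{R}^p$, and $\Gamma_N \in \mathbb{R}^{p \times p}$ are given by

$$ \alpha_N = \mathcal{E}_N((W_H^{-1} y)^2) - \sigma^2 \tag{122} $$

$$ \beta_N = \mathcal{E}_N((W_H^{-1} \phi)(W_H^{-1} y)) \tag{123} $$

$$ \Gamma_N = \mathcal{E}_N((W_H^{-1} \phi)(W_H^{-1} \phi)^T) - \begin{bmatrix} \sigma^2 I & 0 \\ 0 & \mathcal{E}_N((W_H^{-1} W_G \phi_u)(W_H^{-1} W_G \phi_u)^T) \end{bmatrix}. \tag{124} $$

ii) Provided $\Gamma_N^{-1}$ exists, another expression is

$$ \Theta_N = \{\theta : (\theta - \hat{\theta}_N)^T \Gamma_N (\theta - \hat{\theta}_N) \le V_N\} \tag{125} $$

$$ \hat{\theta}_N = \Gamma_N^{-1} \beta_N \tag{126} $$

$$ V_N = \beta_N^T \hat{\theta}_N - \alpha_N. \tag{127} $$

iii) When $\Gamma_N > 0$, $\Theta_N$ is an ellipsoid in $\mathbb{R}^p$, and when $\Gamma_N$
is indefinite, $\Theta_N$ is a hyperboloid in $\mathbb{R}^p$.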

The  proof  of  the  above  proceeds  along  the  same  lines  as 
that  of  Theorem  3,  and  is  omitted. 
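
The passage from the quadratic form (121) to the centered form (125) is a completion of the square. A small sketch for the two-parameter case follows; the restriction to 2x2 symmetric matrices, the numerical values, and the function name `quadratic_set_2d` are assumptions made purely for illustration.

```python
def quadratic_set_2d(alpha, beta, gamma):
    """Center, level, and shape of {th : th^T G th - 2 b^T th + alpha <= 0}.

    Works for a symmetric 2x2 matrix gamma given as nested lists.
    Returns (center, level, shape) with center = gamma^-1 beta and
    level = beta^T center - alpha, as in (126) and (127).
    """
    (g11, g12), (g21, g22) = gamma
    det = g11 * g22 - g12 * g21
    if det == 0.0:
        raise ValueError("gamma is singular, so the centered form does not exist")
    # Explicit 2x2 inverse applied to beta.
    c1 = (g22 * beta[0] - g12 * beta[1]) / det
    c2 = (-g21 * beta[0] + g11 * beta[1]) / det
    level = beta[0] * c1 + beta[1] * c2 - alpha
    if g11 > 0.0 and det > 0.0:
        shape = "ellipsoid"        # gamma positive definite
    elif det < 0.0:
        shape = "hyperboloid"      # gamma indefinite
    else:
        shape = "other"            # gamma negative definite
    return (c1, c2), level, shape


center, level, shape = quadratic_set_2d(1.0, [2.0, 1.0], [[2.0, 0.0], [0.0, 1.0]])
_, _, shape_indef = quadratic_set_2d(1.0, [2.0, 1.0], [[1.0, 0.0], [0.0, -1.0]])
```

For the first call the set is the ellipse centered at (1, 1) with level 2; flipping the sign of one diagonal entry makes the matrix indefinite and the set becomes a hyperboloid.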

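The infinite-data parameter set estimate can also be expressed
in a form identical to that for the finite-data set

$$ \Theta_\infty = \{\theta : \alpha - 2\beta^T \theta + \theta^T \Gamma \theta \le 0\} \tag{128} $$

where $\alpha \in \mathbb{R}$, $\beta \in \mathbb{R}^p$, and $\Gamma \in \mathbb{R}^{p \times p}$ are given by

$$ \alpha = \mathcal{E}((W_H^{-1} y)^2) - \sigma^2 \tag{129} $$

$$ \beta = \mathcal{E}((W_H^{-1} \phi)(W_H^{-1} y)) \tag{130} $$

$$ \Gamma = \mathcal{E}((W_H^{-1} \phi)(W_H^{-1} \phi)^T) - \begin{bmatrix} \sigma^2 I & 0 \\ 0 & \mathcal{E}((W_H^{-1} W_G \phi_u)(W_H^{-1} W_G \phi_u)^T) \end{bmatrix}. \tag{131} $$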

C. Example of $\Theta_N$ with Disturbance


The example system is as before with $G$ given by (23),
and $W_G$ given by (25). The disturbance dynamics is

H(z)  -  (,32) 

and the disturbance weight is

$$ W_H(z) = \frac{1}{b_H} H(z) \tag{133} $$

where $b_H \in (0, 1)$ is a parameter chosen by the user.

The disturbance is simulated as the output of $H$ driven
by $e$, a sequence of independently distributed Gaussian variables
with zero mean and variance $\sigma^2$. Three series of
experiments are carried out to study the effects of noise
power (choice of $\sigma$), mismatch between $H$ and $W_H$ (choice
of $b_H$), and length of data record (choice of $N$). In the first
two experiments, the input $u$ is a linearly-spaced sinesweep
from 0.01 to 0.5 rad/s over 102.3 s, giving $N = 1024$ data
samples. In the third experiment, $N$ is varied.

To study the effects of noise power, $\sigma$ is varied in this
experiment. As suggested by Theorem 5, the parameter set
estimate should expand as $\sigma$ increases. This is supported by
Fig. 7, where $\Theta_N$ is plotted for $\sigma = 0.1$, 0.2, and 0.4. Note
that in all cases shown here, $\Theta^* \subseteq \Theta_N$.

In Fig. 8, the value of $b_H$ is varied from 0.6 to 1.0.
Again, as suggested by Theorem 5, as the mismatch between
$H$ and $W_H$ becomes larger, i.e., $|b_H|$ becomes smaller, $\Theta_N$
grows.

The effects of different data record lengths are studied in
the last experiment. For the cases of $N = 1024$ and 2048
with $\sigma = 0.5$ and $b_H = 1.0$, $\Theta^*$ is not in $\Theta_N$. This is still in
agreement with our results because in the stochastic disturbance
case $\Theta^*$ is only guaranteed to be in $\Theta_N$ as $N$ tends to
infinity. As shown in Fig. 9, $\Theta^*$ is in $\Theta_N$ for $N = 4096$.

Fig. 7. $\Theta_N$ for different values of $\sigma$ ($N = 1024$, $b_H = 0.8$).

Fig. 8. $\Theta_N$ for different values of $b_H$ ($\sigma = 0.2$, $N = 1024$).

VI.  Some  Extensions 

In  this  section,  we  first  consider  the  extension  of  our 
results  for  the  equation-error  set  estimates  to  the  output-error 
set.  We  then  consider  disturbances  which  are  deterministic  in 
nature  rather  than  stochastic,  as  considered  in  the  previous 
section. 

A.  Disturbance-Free  Output-Error  Set  Estimation 

The results obtained for the equation-error set in Section
IV can be repeated mutatis mutandis for the output-error
set, with the notable exception of forming a quadratic set
for computational purposes: none exists for output-error identification
[24].

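Theorem 7: Suppose the measured data $\{y, u : t = 1, \ldots, N\}$
is generated from $y = Gu$ with $G \in \mathcal{G}$. Then the
following holds:

$$ \Theta^* \subseteq \Theta^{oe}[N] \subseteq \Theta_k^{oe}, \quad \forall k \in [1, N], \ \forall N \in \mathbb{Z}_+ \tag{134} $$

where $\Theta^{oe}[N]$ and $\Theta_k^{oe}$ are the output-error set estimates
given by

$$ \Theta_k^{oe} = \left\{\theta : \left\| y - \frac{B_\theta}{A_\theta} u \right\|_{k2} \le \left\| W_G \frac{B_\theta}{A_\theta} u \right\|_{k2} \right\} \tag{135} $$

$$ \Theta^{oe}[N] = \bigcap_{k=1}^{N} \Theta_k^{oe}. \tag{136} $$

Remark: We refer to $\Theta_k^{oe}$ and $\Theta^{oe}[N]$ as output-error
parameter sets because the output-error term $y - (B_\theta / A_\theta) u$
appears in their descriptors.

Proof: The proof of $\Theta^* \subseteq \Theta^{oe}[N] \subseteq \Theta_k^{oe}$ is identical
to the one for Theorem 1. □

Fig. 9. $\Theta_N$ for different values of $N$ ($\sigma = 0.5$, $b_H = 1.0$).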

The sets $\Theta_k$ and $\Theta_k^{oe}$ are both worst-case estimates, both
contain $\Theta^*$, but they are not necessarily the same sets for
identical input sequences. Another major difference is that
both sides of the inequality in $\Theta_k$ are affine in $\theta$, whereas in
$\Theta_k^{oe}$ they are linear fractional in $\theta$. The former property
makes it very easy to compute $\Theta_k$, as has been shown,
whereas the latter makes it difficult to compute the output-error
sets, as usual.

B.  Deterministic  Additive  Disturbances 

So  far,  we  have  only  considered  stochastic  disturbances. 
We  now  briefly  examine  the  effect  of  deterministic  distur¬ 
bances. 

Suppose, as before, that the true system is

$$ y = Gu + He \tag{137} $$

with $G \in \mathcal{G}$ and $H \in \mathcal{H}$ as previously described. We now
consider the following deterministic set which describes
quasi-stationary sequences with bounded spectra:

$$ \mathbf{E}_{\mathrm{spec}} = \{e(t) : S_{ee}(\omega) \le \sigma_{\mathrm{spec}}^2, \ \forall \omega\}. \tag{138} $$

We then obtain the following.

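Theorem 8: If $e \in \mathbf{E}_{\mathrm{spec}}$, then

$$ \Theta_k = \left\{\theta : \sqrt{\mathcal{E}_k([W_H^{-1}(A_\theta y - B_\theta u)]^2)} \le \sqrt{\mathcal{E}_k((W_H^{-1} W_G B_\theta u)^2)} + \sigma_{\mathrm{spec}} \sqrt{1 + \theta_A^T \theta_A} \right\} \tag{139} $$

and

$$ \Theta^* \subseteq \lim_{k \to \infty} \Theta_k. \tag{140} $$

Proof: The proof of (140) proceeds the same as Theorems
1 and 4. Let $\theta^* \in \Theta^*$, then

$$ W_H^{-1}(A_{\theta^*} y - B_{\theta^*} u) = \Delta_G W_H^{-1} W_G B_{\theta^*} u + \Delta_H A_{\theta^*} e. \tag{141} $$

After squaring both sides and taking the sample averages,
Schwarz's inequality, $\|\Delta_G\|_\infty \le 1$, and $\|\Delta_H\|_\infty \le 1$ are
applied to obtain

$$ \sqrt{\mathcal{E}_k([W_H^{-1}(A_{\theta^*} y - B_{\theta^*} u)]^2)} \le \sqrt{\mathcal{E}_k((W_H^{-1} W_G B_{\theta^*} u)^2)} + \sqrt{\mathcal{E}_k((A_{\theta^*} e)^2)}. \tag{142} $$

Now let $k \to \infty$; we have

$$ \theta^* \in \left\{\theta : \sqrt{\mathcal{E}([W_H^{-1}(A_\theta y - B_\theta u)]^2)} \le \sqrt{\mathcal{E}((W_H^{-1} W_G B_\theta u)^2)} + \sigma_{\mathrm{spec}} \sqrt{1 + \theta_A^T \theta_A} \right\} \tag{143} $$

and $\Theta^* \subseteq \lim_{k \to \infty} \Theta_k$. □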

In both cases of stochastic and deterministic disturbances,
the limit set $\Theta^*$ is contained in the set estimate as the data
length tends to infinity. However, in the deterministic case
here, the probabilistic convergence need not be considered.
The reason that both cases can be handled in the same way is
because a common framework is used for deterministic and
stochastic signals [see (5) and (6)]. Note that instead of using
$\mathbf{E}_{\mathrm{spec}}$ to describe the deterministic disturbance $e$, we can
also use

$$ \mathbf{E}_{\mathrm{rms}} = \left\{e(t) : \lim_{N \to \infty} \frac{1}{N} \sum_{t=1}^{N} e(t)^2 \le \sigma_{\mathrm{rms}}^2 \right\} \tag{144} $$

to describe $e$ and obtain results similar to Theorem 8.

VII.  Concluding  Remarks 

The set-membership approach to system identification starts
with the assumption that the underlying true system which
generated the measured data is in a known set characterized
by some unknown parameters and unknown but bounded
nonparametric dynamics. We then derived set estimates for
these unknown parameters. In the disturbance-free case, the
set estimate has the property that it always contains the limit
set. In the presence of stochastic disturbances, the set estimate
is shown to have the property that it contains the limit
set with probability one as the data length tends to infinity.

The set estimates derived in this paper also have some nice
properties for computation. For the equation-error estimates,
the set expressions are quadratic in the parameters. Thus, the
set estimates are either ellipsoids or hyperboloids in the
parameter space. Furthermore, these sets are easily obtained
by computing averages of the filtered input-output data.
However, when the output-error form is used in the set
estimate, these nice properties are lost, which is typical with
output-error identification.

The next step is to use these set estimates with a robust
on-line control design procedure. One approach would be to
bury the parameter uncertainty in another nonparametric
uncertainty by finding an overbounding frequency-dependent
weighting function. This is a potentially very conservative
approach. Alternatively, the minimax approach in [22] and
[21] presents a robust control-design procedure to handle the
specific type of parameter uncertainty as represented by the
ellipsoidal sets. This is a current topic of our research.

Abstract This paper presents a control design method
for continuous-time plants whose uncertain parameters in
the output matrix are only known to lie in an ellipsoidal
set. The desired control is chosen to minimize the maximum
linear quadratic regulator (LQR) cost from all the
plants with parameters in the given set. Although no particular
form is assumed for the minimax control, it turns
out that it is the LQR control for one of the plants in
the set, the worst-case plant. By defining an appropriate
mapping, which maps an element from the given ellipsoidal
set to an element of the same set, the existence of
this worst-case plant is proved. A simple heuristic algorithm
used to compute the worst-case plant is also given.


1  Introduction 

A problem of great interest in control theory is the design
of a controller which can guarantee some level of performance
in the presence of plant parameter uncertainty.
Kharitonov's theorem provides a necessary and sufficient
analysis test for determining the robust stability of polynomials
with perturbed coefficients; however, there are
few results that exploit Kharitonov's theorem for synthesizing
robust controllers, e.g., [7] and [12]. Another approach
to this problem is to define a set of nominal values
of the uncertain parameters and consider deviations from
these nominal values. A comprehensive survey of the different
parameter space methods for robust control design,
as opposed to frequency domain methods, can be found
in [23].

The technique of solving control problems as minimax
optimization problems is the basis of the so-called "$H_\infty$
optimal control theory." In the standard $H_\infty$ problem,
the control input is chosen to minimize the norm of the
output and the exogenous input is chosen to maximize it
[2]. Along this line, the structured singular value ($\mu$) synthesis
method is used to find controllers which minimize
an $H_\infty$ objective subject to plant perturbations, e.g., see
[8], [9], and references therein. In [20], a game theoretic
approach is used, where the control, restricted to a function
of the measurement history, plays against adversaries
composed of the process and measurement disturbances,
and the initial states. Another example of solving control
problems as minimax problems is [18], which presents a
controller design method to minimize the weighted sum
of the maximum linear quadratic Gaussian (LQG) performance
objectives over a set of worst plant parameter
changes.

The approach of using set-membership to describe
plant parameter uncertainty has gained popularity in recent
years, e.g., [14], [16], [26], [3], [17], and references
therein. This approach to parameter identification
originated from the early works of [22] and [5], where the set of
possible system states compatible with the observations is
shown to be an ellipsoid. Motivated by ellipsoidal bounds
on plant parameters, we pose the following robust control
problem: given that the unknown parameters in the output
matrix of the plant are known to lie in an ellipsoid,
find the control which minimizes the maximum LQR cost
from all plants with parameters in the given set. Viewed
in terms of game theory, the control and plant uncertainty
are strategies employed by opposing players in a game,
where the control is chosen to minimize the LQR cost
and the plant uncertainty is chosen to maximize it. A
special case of our problem, finding the finite-horizon control
for a discrete finite-impulse response (FIR) plant, was
solved in [15]. In that case, it was shown that the minimization
is a convex optimization problem. In this paper,
we generalize the robust control design problem to
find the infinite-horizon controls for continuous plants.

The assumption that the output matrix in the plant description
contains all the uncertainty deserves further discussion.
First, this is a natural extension of the discrete
FIR finite-horizon problem solved in [15]. In the discrete
case, FIR model sets can be identified from input-output
data of a plant, i.e., the coefficients of the FIR model are
identified to belong to a set. This is particularly attractive
when a bounded noise model, often a more realistic
assumption than a statistical noise model, is used in
the identification [19]. In the continuous case, Laguerre
models can be used so that the identification is reduced
to estimating the Laguerre coefficients [25]. Uncertainty
in the Laguerre coefficients can then be described by set
membership of the output matrix. Second, by limiting
uncertain parameters to the output matrix, we simplify
the analysis and gain more insights into the nature of the
solution.

The paper is organized as follows. After stating the
problem in the next section, the minimax control is
proved in section 3 to be the LQR control designed for the
worst-case plant from the given ellipsoidal set. By defining
an appropriate mapping, which maps an element of
the given set to an element of the same set, the existence
of this worst-case plant is proved. In section 4, a simple
algorithm used to compute the worst-case plant is given.
A two-mass-one-spring example is used in section 5 to illustrate
the ideas presented. The paper concludes with
some remarks in section 6.


2  Problem  Formulation 

Consider the following family of systems

$$ \dot{x}(t) = Ax(t) + bu(t), \quad x(0) = x_0 \tag{1} $$

$$ y(t) = c^T x(t), \tag{2} $$

where $A$, $b$, and $x_0$ are fixed and given, and

$$ c \in \Theta = \{\theta : (\theta - \theta_c)^T R (\theta - \theta_c) \le 1, \ R = R^T > 0\}. \tag{3} $$

For a given control, $u : \mathbb{R}_+ \to \mathbb{R}$, and a fixed $c \in \Theta$, the
LQR cost is defined to be

$$ J(u, c) = \int_0^\infty [r u(t)^2 + y(t)^2] \, dt. \tag{4} $$

We assume that $(A, b)$ is controllable (or at least stabilizable)
and $(c, A)$ is observable (or at least detectable) for
all $c$ in $\Theta$. The robust control design problem is to find a
control $u : \mathbb{R}_+ \to \mathbb{R}$ that solves the following minimax
problem:

$$ \min_{u} \max_{c \in \Theta} J(u, c). \tag{5} $$

Since no particular form is assumed for the control $u$, such
as linear state-feedback, the minimization in (5) is over
all possible $u$'s. Note also that we chose the initial time
$t = 0$ for convenience only; the problem can be posed at
any initial time $t = t_0$. Therefore, one can design a new
controller each time $\Theta$ gets updated.

The cost objective in (4) and the ellipsoidal set in (3)
lead to another interesting interpretation for the minimax
problem in (5) once we rewrite (4) as

$$ J(u, c) = \int_0^\infty [r u(t)^2 + x^T(t) c c^T x(t)] \, dt. \tag{6} $$

Now, instead of saying that we are designing a controller
for a set of uncertain plants described by (1) through
(3), we can also say that we are designing a controller
for a set of uncertain objective functions. (This interpretation
contrasts with the standard LQR design where a
controller is obtained for fixed weighting matrices.) Note
that $c^T x(t)$ is a dot product, so it depends on the angle
between $c$ and $x(t)$. Geometrically, the set $\Theta$ sweeps
out a "cone" (with a curved base) of possible $c$'s. Thus,
we can interpret $\Theta$ as a set of "view angles" from which
we calculate the cost. The minimax control from (5) is
therefore robust to all these "view angles." This interpretation
is interesting since in practice we seldom look at
performance from just one angle.

3  Minimax  Solution 

To solve the minimax problem in (5), recall from [6, pages
274-282] that $(u^*, c^*)$ is a saddle point if

$$ J(u^*, c) \le J(u^*, c^*) \le J(u, c^*) \tag{7} $$

for all $u : \mathbb{R}_+ \to \mathbb{R}$ and $c \in \Theta$. In that case, we have

$$ (u^*, c^*) = \arg \min_{u} \max_{c \in \Theta} J(u, c) = \arg \max_{c \in \Theta} \min_{u} J(u, c). \tag{8} $$

Our goal in this section is to prove that there always exists
such $(u^*, c^*)$ for (5).

From LQR control theory, the second inequality in (7)
is true if

$$ u^* = u_{LQR}(c^*), \tag{9} $$

where $u_{LQR}(c^*)$ denotes the LQR control designed for the
plant in (1) with $c = c^*$ in (2). It follows that the first
inequality in (7) is also true if

$$ c^* = \arg \max_{c \in \Theta} J(u_{LQR}(c^*), c). \tag{10} $$

Thus, if $c^*$ exists for (10), the minimax problem in (5) is
solved by (9). Note that the existence of $c^*$ is not obvious
because $c^*$ must have the property that when $u_{LQR}(c^*)$
is applied to each $c \in \Theta$, the maximum cost occurs at $c^*$.

We now express the LQR cost in (10) in a more convenient
form. Since $(A, b)$ is stabilizable and $(c, A)$ is
detectable for all $c$ in $\Theta$, for each $c \in \Theta$ there is an associated
state-feedback control $u_{LQR}(c)$ given by

$$ u_{LQR}(c) = -K_c x, \tag{11} $$

where

$$ K_c = \frac{1}{r} b^T P_c \tag{12} $$

and $P_c$ satisfies the algebraic Riccati equation

$$ A^T P_c + P_c A - \frac{1}{r} P_c b b^T P_c + c c^T = 0. \tag{13} $$

We will use $X_c$ to denote the solution of the associated
Lyapunov equation,

$$ (A - bK_c) X_c + X_c (A - bK_c)^T + x_0 x_0^T = 0, \tag{14} $$

where

$$ X_c = \int_0^\infty e^{(A - bK_c)t} x_0 x_0^T e^{(A - bK_c)^T t} \, dt. \tag{15} $$


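The LQR cost in (10) can now be expressed as

$$
\begin{aligned}
J(u_{LQR}(c^*), c) &= \int_0^\infty [r u_{LQR}(c^*)^2 + y^2] \, dt \\
&= \int_0^\infty [r K_{c^*} x x^T K_{c^*}^T + c^T x x^T c] \, dt \\
&= r K_{c^*} \left( \int_0^\infty e^{(A - bK_{c^*})t} x_0 x_0^T e^{(A - bK_{c^*})^T t} \, dt \right) K_{c^*}^T + c^T \left( \int_0^\infty e^{(A - bK_{c^*})t} x_0 x_0^T e^{(A - bK_{c^*})^T t} \, dt \right) c \\
&= r K_{c^*} X_{c^*} K_{c^*}^T + c^T X_{c^*} c.
\end{aligned}
\tag{16}
$$

For a given $c^*$, $K_{c^*} X_{c^*} K_{c^*}^T$ in (16) is fixed. Thus, the
maximization in (10) becomes

$$ c^* = \arg \max_{c \in \Theta} c^T X_{c^*} c. \tag{17} $$

Note that the feedback gain $K_{c^*}$ does not depend on the
initial condition $x_0$, but the Lyapunov solution $X_{c^*}$ does.
Therefore, the solution $c^*$ is a function of $x_0$. However,
this dependence on $x_0$ can be removed if we start with
the assumption that $x_0$ is a random vector with known
mean $m$ and covariance $C$ and the objective in (4) is an
expectation over $x_0$. In that case, $X_{c^*}$ is the solution of
(14) with $x_0 x_0^T$ replaced by $C + m m^T$.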
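
To see (12) through (17) at work, the scalar case can be evaluated in closed form. The sketch below is only an illustration under assumed values: a scalar plant with parameters `a`, `b`, weight `r`, initial state `x0`, and an uncertain output gain `c` in the interval [0.5, 1.5] (the scalar analogue of the ellipsoid in (3)). The function names `lqr_scalar` and `cost` are hypothetical.

```python
import math


def lqr_scalar(a, b, c, r):
    """Stabilizing solution of 2 a P - (b^2 / r) P^2 + c^2 = 0 and gain K = b P / r."""
    s = math.sqrt(a * a + b * b * c * c / r)
    p = r * (a + s) / (b * b)
    return p, b * p / r


def cost(a, b, r, x0, c_design, c_true):
    """Scalar form of (16): J(u_LQR(c_design), c_true) = r K X K + c_true X c_true."""
    _, k = lqr_scalar(a, b, c_design, r)
    a_cl = a - b * k                      # closed-loop pole
    if a_cl >= 0.0:
        raise ValueError("closed loop is not stable")
    x_gram = -x0 * x0 / (2.0 * a_cl)      # solves 2 a_cl X + x0^2 = 0, cf. (14)
    return r * k * k * x_gram + c_true * c_true * x_gram


a, b, r, x0 = 1.0, 1.0, 1.0, 2.0          # illustrative unstable scalar plant
c_star = 1.5                              # endpoint with the largest |c|
p_star, _ = lqr_scalar(a, b, c_star, r)
j_star = cost(a, b, r, x0, c_star, c_star)
others = [0.5, 1.0]
left_ok = all(cost(a, b, r, x0, c_star, c) <= j_star for c in others)
right_ok = all(cost(a, b, r, x0, c, c_star) >= j_star for c in others)
```

In the scalar case the maximization in (17) always selects the endpoint with the largest magnitude, so the fixed point is immediate; `left_ok` and `right_ok` check the two saddle-point inequalities of (7) at a few sample points, and `j_star` agrees with the familiar optimal cost P x0^2.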

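Our ultimate goal is to find $c^*$ in (17), but we must first
prove that such $c^*$ always exists. To do that, we define
the mapping $f : \hat{c} \in \Theta \to c \in \Theta$,

$$ f(\hat{c}) = c = \arg \max_{c \in \Theta} c^T X_{\hat{c}} c, \tag{18} $$

where $X_{\hat{c}}$ satisfies the Lyapunov equation associated with
$\hat{c}$ as in (14). It was shown in [15] that the solution of (18)
is given by

$$ c = T \Lambda^{-1/2} z + \theta_c \in \Theta_b, \tag{19} $$

where

$$ R = T \Lambda T^T \tag{20} $$

$$ z = (\Omega - \lambda I)^{-1} \beta \tag{21} $$

$$ \lambda = \max \{\lambda : z^T z = 1\} \tag{22} $$

$$ \Omega = R^{-T/2} X_{\hat{c}} R^{-1/2} \tag{23} $$

$$ \beta = -R^{-T/2} X_{\hat{c}} \theta_c \tag{24} $$

$$ \Theta_b = \{\theta : (\theta - \theta_c)^T R (\theta - \theta_c) = 1\}. \tag{25} $$

($\Theta_b$ is the boundary of $\Theta$.) Therefore, the mapping $f$ consists
of two parts. First, it takes the given $\hat{c}$ and produces
$X_{\hat{c}}$ via equations (12) through (14). Then $c$ is given by
(19).

To show that $c^*$ exists in (17) is equivalent to showing
that a fixed point $c^*$ exists for $f$,

$$ f(c^*) = c^*. \tag{26} $$
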

To do that, we need a lemma extracted from [11] and a
simple form of Brouwer's Theorem [13, pages 366-367].

Lemma 1 If $(A, b)$ is stabilizable, then over any region
where $(c, A)$ is detectable, the algebraic Riccati equation
solution $P_c$ in (13) is continuous in $cc^T$.


Proof of Lemma 1 Consider the matrix-valued functional

$$ g(P, cc^T) = A^T P + P A - \frac{1}{r} P b b^T P + cc^T. \tag{27} $$

For any $c$, $P_c$ satisfies (13), so $g(P_c, cc^T) = 0$. As a
quadratic function in $P$ and a linear function in $cc^T$, the
functional $g$ is infinitely differentiable, and its derivative
with respect to $P$ at the point $(P_c, cc^T)$ is the linear operator
given for any matrix $Z$ by

$$ Dg_P(Z) = (A - bK_c)^T Z + Z(A - bK_c). \tag{28} $$

Since $K_c$ is stabilizing, the operator $Dg_P$ is nonsingular
by Lyapunov's equation. Therefore, from the implicit
function theorem (see, e.g., [21, pages 375-380]), there exists
an infinitely differentiable matrix-valued function $\psi$
such that

$$ P_c = \psi(cc^T). \tag{29} $$

Thus, $P_c$ is continuous in $cc^T$. □

Theorem 2 (Brouwer's Theorem) Let $C$ be a compact,
convex subset of $\mathbb{R}^n$. Then any continuous function
$f : C \to C$ has at least one point $c^*$ such that $f(c^*) = c^*$.

The existence of $c^*$ in (17) can now be guaranteed by
the following theorem.

Theorem 3 (Fixed Point) The mapping $f$ defined in
(18) is continuous in $\hat{c}$ and it has a fixed point.


Proof of Theorem 3 First, we need to show that the
mapping from $\hat{c}$ to $X_{\hat{c}}$ is continuous.

1. Let $c = \hat{c}$ in (12) through (14). By Lemma 1, $P_{\hat{c}}$ of
(13) is continuous in $\hat{c}\hat{c}^T$. Since each element of $\hat{c}\hat{c}^T$
is simply a product of elements from $\hat{c}$, $\hat{c}\hat{c}^T$ is continuous
in $\hat{c}$. By the continuity of composite functions,
$P_{\hat{c}}$ is continuous in $\hat{c}$.

2. $K_{\hat{c}}$ of (12) is continuous in $P_{\hat{c}}$, thus it is continuous
in $\hat{c}$.

3. By the implicit function theorem (similar to the proof
of Lemma 1), $X_{\hat{c}}$ is continuous in $K_{\hat{c}}$. By the continuity
of composite functions, $X_{\hat{c}}$ is continuous in $\hat{c}$.

Second, we need to show that the mapping from $X_{\hat{c}}$ to $c$
is also continuous.

1. Both $\Omega$ and $\beta$ in (23) and (24) are continuous in $X_{\hat{c}}$.
Since each eigenvalue of a matrix is continuous in
the elements of the matrix (see, e.g., [10, pages 191-192]),
$\lambda$ in (22) is continuous in $X_{\hat{c}}$. Thus, by the
continuity of composite functions, $\lambda$ is continuous in
$\hat{c}$.

2. Each element of $(\Omega - \lambda I)^{-1}$ is given by its cofactor
divided by $\det(\Omega - \lambda I)$. The cofactors and $\det(\Omega - \lambda I)$
are sums of products of elements of $\Omega - \lambda I$. Thus,
$(\Omega - \lambda I)^{-1}$ is continuous in $\hat{c}$, which implies $z$ in
(21) is continuous in $\hat{c}$ also. (The exception is when
$\Omega - \lambda I$ is singular, which is treated in [15]. However,
continuity is not affected.)

3. $c$ in (19) is continuous in $\hat{c}$.

Therefore, the mapping $f$ from $\hat{c}$ to $c$ is continuous, and
by Brouwer's Theorem it has at least one fixed point. □

The  existence  of  a  saddle-point  solution  for  the  mini¬ 
max  problem  in  (5)  is  stated  in  the  following  theorem. 

Theorem 4 (Existence) There exists at least one
$(u^*, c^*)$ such that (7) is true and the minimax problem
in (5) has a saddle-point solution. If there is more than
one $(u, c)$ which satisfies (7), then their associated LQR
costs must be equal and any one of the solutions is equally
valid.

Proof of Theorem 4 From Theorem 3, we know that
(10) has at least one fixed point. Therefore, (7) has at
least one saddle-point solution. To show that two fixed
points of (10) must have the same LQR cost, assume that
there exist $(u_1, c_1)$ and $(u_2, c_2)$ such that

$$ J(u_1, c) \le J(u_1, c_1) \le J(u, c_1), \quad \forall u, c \tag{30} $$

and

$$ J(u_2, c) \le J(u_2, c_2) \le J(u, c_2), \quad \forall u, c. \tag{31} $$

Letting $c = c_2$ and $u = u_2$ in (30), and using the first inequality of (31) with $c = c_1$, we get

$$ J(u_1, c_2) \le J(u_1, c_1) \le J(u_2, c_1) \le J(u_2, c_2) \tag{32} $$

or

$$ J(u_1, c_1) \le J(u_2, c_2). \tag{33} $$

Similarly, letting $c = c_1$ and $u = u_1$ in (31), we get

$$ J(u_2, c_1) \le J(u_2, c_2) \le J(u_1, c_2) \le J(u_1, c_1) \tag{34} $$

or

$$ J(u_2, c_2) \le J(u_1, c_1). \tag{35} $$

Therefore, (33) and (35) imply

$$ J(u_1, c_1) = J(u_2, c_2). \tag{36} $$

□

This section can be summarized as follows: a fixed-point
solution $c^*$ exists for (10) and the solution to the
minimax problem in (5) is given by (9). We now turn to
the computation of $c^*$.


4  Fixed-Point  Computation 


Before describing our simple heuristic algorithm, we
should point out that there exist many algorithms to
compute Brouwer fixed points (see, e.g., [1] and [24]).
Although these algorithms can guarantee that the fixed
points will be found, they are known to have combinatorial
complexity. In comparison, we have no guarantee
that our algorithm will converge, but in the many cases that
we have tried, it usually converges in fewer than 10 iterations.

The goal of the iterative algorithm below is to find $\hat{c}_k$
such that the distance between $\hat{c}_k$ and $c_k = f(\hat{c}_k)$, as defined
in (18), is small, i.e., a fixed point. Given $\hat{c}_k$ and $c_k$
at the $k$th iteration, steps 6 through 8 below are designed
to find $\hat{c}_{k+1}$ and $c_{k+1}$. The algorithm accomplishes this by
doing a local minimization over a set of candidate points,
$\{\bar{p}_i, \ i = 1, \ldots, N\}$. Let $\{p_i, \ i = 1, \ldots, N-1\}$ be $N - 1$
equally-spaced points between $\hat{c}_k$ and $c_k$, with $p_N = c_k$
(see Figure 1). Vectors are then drawn from $\theta_c$ to each
$p_i$, until they intersect $\Theta_b$ at points $\{\bar{p}_i, \ i = 1, \ldots, N\}$,
where

$$ \bar{p}_i = \gamma_i w_i + \theta_c \tag{37} $$

$$ w_i = p_i - \theta_c \tag{38} $$

$$ \gamma_i = (w_i^T R w_i)^{-1/2}. \tag{39} $$

Next, we compute $\tilde{p}_i = f(\bar{p}_i)$ in step 7. After comparing
the distances $\|\bar{p}_i - \tilde{p}_i\|_2$, the $\bar{p}_j$ and $\tilde{p}_j$ with the minimum
distance become $\hat{c}_{k+1}$ and $c_{k+1}$, respectively.

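A Heuristic Algorithm

1. Define the mapping $f$ from $\hat{c}$ to $c$:
compute $X_{\hat{c}}$ in (18) using (13), (12), and (14), then
compute $c$ using (19);

2. $k \leftarrow 0$;

3. Let $\hat{c}_1$ be a random point on $\Theta_b$;

4. Compute $c_1 = f(\hat{c}_1)$;

5. $k \leftarrow k + 1$;

6. Compute $\{\bar{p}_i, \ i = 1, \ldots, N\}$ on $\Theta_b$ using (37);

7. Compute $\tilde{p}_i = f(\bar{p}_i)$ for $i = 1, \ldots, N$;

8. Compute

$$ j = \arg \min_{i} \|\bar{p}_i - \tilde{p}_i\|_2 \tag{40} $$

then

$$ c_{k+1} = \tilde{p}_j \tag{41} $$

$$ \hat{c}_{k+1} = \bar{p}_j; \tag{42} $$

9. If $\|\hat{c}_{k+1} - c_{k+1}\|_2 > \epsilon$, go to step 5.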

Note that there is no guarantee that $\|c_{k+1} - \hat{c}_{k+1}\|_2 < \|c_k - \hat{c}_k\|_2$, so we do not have a convergence proof for this algorithm. However, with $\epsilon = 0.001$, this algorithm usually converges in fewer than 10 iterations.
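The nine steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the mapping `f` is supplied by the caller (the paper's `f` from (18) and (19) is not reproduced here), the boundary is taken to be $\{c : (c-\theta_c)^T R (c-\theta_c) = 1\}$, the candidates are read as $\tilde{p}_i = c_k + (i/N)(\hat{c}_k - c_k)$, and the function names and the `max_iter` safeguard are hypothetical.

```python
import numpy as np

def project_to_boundary(p, theta_c, R):
    """Point where the ray from theta_c through p meets the boundary
    {c : (c - theta_c)^T R (c - theta_c) = 1}. Assumes p != theta_c."""
    w = p - theta_c
    return theta_c + w / np.sqrt(w @ R @ w)

def heuristic_fixed_point(f, c1, theta_c, R, N=4, eps=1e-3, max_iter=50):
    """Heuristic search for c with c ~= f(c); returns (c, c_hat, iterations, converged)."""
    c = np.asarray(c1, dtype=float)       # step 3: starting point, chosen by the caller
    c_hat = f(c)                          # step 4
    for k in range(1, max_iter + 1):      # step 5
        # step 6: N points from c towards c_hat (the last one is c_hat),
        # each pushed out to the ellipsoid boundary
        cands = [project_to_boundary(c + (i / N) * (c_hat - c), theta_c, R)
                 for i in range(1, N + 1)]
        images = [f(p) for p in cands]    # step 7
        dists = [np.linalg.norm(p - q) for p, q in zip(cands, images)]
        j = int(np.argmin(dists))         # step 8: keep the best candidate
        c, c_hat = cands[j], images[j]
        if dists[j] <= eps:               # step 9: stop once c and f(c) are close
            return c, c_hat, k, True
    return c, c_hat, max_iter, False
```

As a toy usage (not the paper's mapping), taking `f(c) = A c / ||A c||` on the unit circle with `A = diag(2, 1)` drives the iterate towards the dominant eigenvector of `A`, which is a fixed point of that map.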


Figure  2:  Two-mass-one-spring  system. 


The initial condition is $x_0 = [\,1\ \ {-1}\ \ 0\ \ 0\,]^T$, which means the masses are displaced toward each other. For the ellipsoidal set in (3), we use $\theta_c = [\,0\ \ 1\ \ 0\ \ 1\,]^T$ and $R = I$. Thus, the output $y$ is nominally the sum of the position and velocity of the second mass, but $c$ can still be anywhere within the unit ball. We choose $r = 1$ in the objective and $N = 4$ in the fixed-point algorithm. For the stopping criterion, $\epsilon = 0.001$ is used. The algorithm converges in 5 iterations.

Table 1 shows the cost matrix for this example, where $c_{LQR}$ is the element in $\Theta$ which maximizes the cost for $u = u_{LQR}(\theta_c)$. As expected, the control $u = u_{LQR}(\theta_c)$ applied to $c = \theta_c$ gives the lowest cost for this control, 5.6, but its cost can be quite high at other $c$'s such as $c_{LQR}$ and $c^*$. In comparison, the control $u = u_{LQR}(c^*)$ applied to $c = \theta_c$ gives a slightly higher cost (but this may not be the lowest cost for this control, as it is likely that another $c$ achieves the minimum) while keeping the maximum cost to 13.4, as compared to a maximum of 17.1 for $u = u_{LQR}(\theta_c)$. Therefore, this example illustrates that by giving up some performance at the nominal plant $\theta_c$, we gain some performance back for other plants in the set.


5  Example 


We  will  use  the  two-mass-one-spring  system  described  in 
[4]  in  our  example.  This  system,  shown  in  Figure  2,  can 
be  represented  in  state-space  form  as 


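$$\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \\ \dot{x}_4 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -\frac{k}{m_1} & \frac{k}{m_1} & 0 & 0 \\ \frac{k}{m_2} & -\frac{k}{m_2} & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ \frac{1}{m_1} \\ 0 \end{bmatrix} u \tag{43}$$

$$y = c^T x \tag{44}$$

where $x_1$ and $x_2$ are the positions of masses 1 and 2, and $x_3$ and $x_4$ are the velocities of masses 1 and 2, respectively. We use masses $m_1 = m_2 = 1$ kg and spring coefficient $k = 1$ N/m for this system.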
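For readers who want to experiment with this plant, the state-space matrices can be assembled as below. This is a minimal sketch assuming the usual two-mass-one-spring structure with the control force acting on mass 1; the function name is hypothetical and not from the paper.

```python
import numpy as np

def two_mass_spring(m1=1.0, m2=1.0, k=1.0):
    """Return (A, b) for states [pos1, pos2, vel1, vel2], with force u on mass 1."""
    A = np.array([
        [0.0,     0.0,     1.0, 0.0],
        [0.0,     0.0,     0.0, 1.0],
        [-k / m1, k / m1,  0.0, 0.0],
        [k / m2,  -k / m2, 0.0, 0.0],
    ])
    b = np.array([0.0, 0.0, 1.0 / m1, 0.0])
    return A, b

A, b = two_mass_spring()
# Open-loop modes: a rigid-body mode (double eigenvalue at 0) and an
# undamped oscillation at sqrt(k * (1/m1 + 1/m2)) rad/s.
eigs = np.linalg.eigvals(A)
```

With $m_1 = m_2 = 1$ kg and $k = 1$ N/m the oscillatory pair sits at $\pm j\sqrt{2}$, so the open-loop plant is only marginally stable.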


                          c = θ_c    c = c_LQR    c = c*
u = u_LQR(θ_c)            5.6        17.1         16.9
u = u_LQR(c*)             7.3        13.3         13.4

Table 1: Cost matrix for different u's and c's.


6  Conclusion 

We presented a controller design method for continuous-time plants whose uncertain parameters in the output matrix are known to lie in an ellipsoidal set. This design problem is posed as a minimax problem, where the control and plant uncertainty can be viewed as strategies employed by opposing players in a game, in which the control is chosen to minimize the LQR cost and the plant uncertainty is chosen to maximize it. Without restricting the form of this minimax control, we proved that it is the LQR control for one of the plants in the ellipsoidal set, the worst-case plant. We then proved that this worst-case plant always exists as a fixed point for a certain mapping. A simple heuristic algorithm for computing this fixed point was also given.

[13] Samuel Karlin. Mathematical methods and theory in games, programming, and economics, volume 2. Addison-Wesley, Reading, MA, 1959.

[14] Robert L. Kosut, Ming Lau, and Stephen Boyd
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A  Robust  Control  Design  for  FIR  Plants  with 
Parameter  Set  Uncertainty 
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Abstract  This  paper  proposes  a  method  of  computing 
the finite-horizon control inputs for FIR plants whose parameters are only known to lie in a set. The parameter set is assumed to be described by an ellipsoidal bound, which could be provided by some identification scheme with a parameter set estimator. The finite-horizon control obtained minimizes the maximum LQR cost from all plants with parameters in the given set. The computation of this robust control is shown to be a convex optimization problem, thus global minimization is guaranteed and many efficient methods are available to compute the minimizing control. In addition, the method can also be used to compute the control for the dual problem in which the plant parameters are known but the initial states of the plant are assumed to lie in a set.

1  Introduction 

A problem of great interest in control theory is the design of a controller which can guarantee some level of performance in the presence of plant parameter uncertainty. Kharitonov's theorem provides a necessary and sufficient analysis test for determining the robust stability of polynomials with perturbed coefficients; however, there are few results that exploit Kharitonov's theorem for synthesizing robust controllers, e.g., [4] and [10]. Another approach to this problem is to define a set of nominal values of the uncertain parameters and consider deviations from these nominal values. A comprehensive survey of the different parameter space methods, as opposed to frequency domain methods, can be found in [13].

Motivated by recent work from [11], [12], and [1], where the identified plant parameters are described by ellipsoidal sets, we pose the following problem: given that the plant parameters are known to lie in an ellipsoid, find the finite-horizon control which minimizes the maximum LQR cost from all plants with parameters in the given set. At time $k$, this minimization produces the control vector $[\,u(k)\ \ u(k+1)\ \cdots\ u(k+N)\,]$, but only $u(k)$ is applied. At time $k + 1$, a new minimization problem is solved. This approach of control application is the same as the generalized predictive control described in [5] and [2].

In this paper, we choose to work with finite impulse response (FIR) models for the plant with the assumption that they are accurate models provided they are of sufficient length. (In doing so, we have also assumed that the plant is stable.) Our goals are to show that the above minimization problem is a convex optimization problem and to design an algorithm to compute the minimizing control. In order to solve the minimization problem, a constrained maximization problem must also be solved; the procedure for this is given in the Appendix. We will also show that the same algorithm can be used to compute the control for the dual problem in which the plant parameters are known but the initial states of the plant are assumed to lie in a set. The paper is organized as follows. After stating the problem in the next section, we will show convexity in section 3 and outline the algorithm. The dual problem of uncertain initial states is considered in section 4. A numerical example is given in section 5. Some concluding remarks are given in section 6.

2  Problem  Statement 

We shall consider a discrete FIR plant

$$y(k) = b_1 u(k-1) + \cdots + b_m u(k-m) = \theta^T \phi(k) \tag{1}$$

where $y(k)$ and $u(k)$ are the output and control of the plant at time $k$, respectively, and

$$\theta = [\,b_1\ \ b_2\ \cdots\ b_m\,]^T$$

$$\phi(k) = [\,u(k-1)\ \ u(k-2)\ \cdots\ u(k-m)\,]^T$$

The parameter vector of the plant, $\theta$, is assumed to be in a set,

$$\theta \in \Theta = \{\theta : (\theta - \theta_c)^T \Gamma (\theta - \theta_c) \le 1\} \tag{2}$$

where $\Gamma = \Gamma^T > 0$. Note that $\Theta$ describes an ellipsoid in the parameter space with its center at $\theta_c$. The matrix $\Gamma$ gives the size and orientation of the ellipsoid, i.e., the square roots of the reciprocals of the eigenvalues of $\Gamma$ are the lengths of the semi-axes of the ellipsoid and the eigenvectors of $\Gamma$ are the directions of the semi-axes.

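The plant in (1) can also be represented in state space format,

$$x(k+1) = A x(k) + b u(k) \tag{3}$$

$$y(k) = c x(k) \tag{4}$$

where

$$A = \begin{bmatrix} 0 & 0 & \cdots & 0 & 0 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix}, \qquad b = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$

and

$$c = [\,b_1\ \ b_2\ \cdots\ b_m\,] = \theta^T$$

Thus, the states of the FIR plant are

$$x(k) = [\,u(k-1)\ \ u(k-2)\ \cdots\ u(k-m)\,]^T = \phi(k)$$

Due to past disturbances, the states at some time $k_0$ are displaced to $\phi(k_0) = \phi_0 \ne 0$, so $y(k_0) \ne 0$. Without loss of generality, we let $k_0 = 0$. We now define the control and output vectors

$$u = [\,u(0)\ \ u(1)\ \ u(2)\ \cdots\ u(N)\,]^T$$

$$y = [\,y(0)\ \ y(1)\ \ y(2)\ \cdots\ y(N)\,]^T$$

and the quadratic cost function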


3  Robust  Control  Design 

We will solve the minimax problem of (6) by showing that it is a convex optimization problem. Note that since $u^T u$ is not a function of $\theta$, we have

$$u^* = \arg\min_u \left[ J_1(u) + J_2(u) \right]$$

where

$$J_1(u) = \rho u^T u \tag{8}$$

$$J_2(u) = \max_{\theta \in \Theta} y^T y \tag{9}$$


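We can express $y$ as

$$y = U\theta$$

where

$$U = \begin{bmatrix} u(-1) & u(-2) & \cdots & u(-m) \\ u(0) & u(-1) & \cdots & u(-m+1) \\ \vdots & \vdots & & \vdots \\ u(N-1) & u(N-2) & \cdots & u(N-m) \end{bmatrix}$$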
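The structure of $U$ can be made concrete with a short sketch. The helper below is hypothetical (not from the paper); it assumes the control vector is `[u(0), ..., u(N)]` and the initial state is `[u(-1), ..., u(-m)]`, and fills row $k$ with $[u(k-1), \ldots, u(k-m)]$ so that `U @ theta` reproduces the FIR outputs $y(0), \ldots, y(N)$.

```python
import numpy as np

def build_U(u, phi0):
    """Stack the regressors phi(k)^T, k = 0..N, so that y = U @ theta.

    u    : [u(0), ..., u(N)]
    phi0 : [u(-1), ..., u(-m)]
    """
    u = np.asarray(u, dtype=float)
    phi0 = np.asarray(phi0, dtype=float)
    m, N = len(phi0), len(u) - 1
    # hist[m + t] holds u(t) for t = -m, ..., N
    hist = np.concatenate([phi0[::-1], u])
    return np.array([[hist[m + k - j] for j in range(1, m + 1)]
                     for k in range(N + 1)])

# Small case with m = 2, N = 1: rows are [u(-1), u(-2)] and [u(0), u(-1)]
U = build_U([-0.4, 0.0], [-1.0, 1.0])
y = U @ np.array([0.5, -1.0])
```

Note that $u(N)$ never appears in $U$: the last control sample cannot influence any output inside the horizon.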

We now state and prove the following corollary, which states that the maximizer of (9) always lies on the boundary of $\Theta$.

Corollary 1 Let $\|\cdot\|_2$ denote the Euclidean norm, i.e.,

$$\|x\|_2^2 = x^T x$$

For a fixed matrix $U$,

$$f(\theta) = \|U\theta\|_2^2$$

is convex in $\theta$ and


$$J_0 = \rho u^T u + y^T y \tag{5}$$

where $\rho$ is a weight to trade control effort for regulation. The problem is to find a control which minimizes the cost function for the worst possible plant in $\Theta$, i.e.,

$$u^* = \arg\min_u \max_{\theta \in \Theta} J_0 \tag{6}$$

Thus, $u^*$ is designed to be robust with respect to the parameter set uncertainty given in (2). Note that if there were no parameter uncertainty in the plant, $\theta = \theta_c$, then (6) becomes

$$u_{LQR} = \arg\min_u J_0 \tag{7}$$

which is the standard finite-horizon linear quadratic regulator problem. The optimal control in (7) requires the solution of the discrete Riccati equation, which can be found in texts such as [7, 2].


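$$\max_{\theta \in \Theta} f(\theta) = \max_{\theta \in \Theta_b} f(\theta) \tag{10}$$

where

$$\Theta_b = \{\theta : (\theta - \theta_c)^T \Gamma (\theta - \theta_c) = 1\} \tag{11}$$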

Proof of Corollary 1 Let $\alpha \in [0, 1]$, then

$$f(\alpha\theta_1 + (1-\alpha)\theta_2) - \alpha f(\theta_1) - (1-\alpha) f(\theta_2)$$
$$= \|U(\alpha\theta_1 + (1-\alpha)\theta_2)\|_2^2 - \alpha\|U\theta_1\|_2^2 - (1-\alpha)\|U\theta_2\|_2^2$$
$$= -\alpha(1-\alpha)\|U(\theta_1 - \theta_2)\|_2^2$$
$$\le 0$$

Thus, $f(\theta)$ is convex in $\theta$. Now let $\theta_1, \theta_2 \in \Theta_b$, then

$$f(\alpha\theta_1 + (1-\alpha)\theta_2) \le \alpha f(\theta_1) + (1-\alpha) f(\theta_2)$$

Since the graph of $f(\theta)$ along the line segment joining any $\theta_1$ and $\theta_2$ lies on or below the line segment with its ends at $f(\theta_1)$ and $f(\theta_2)$, (10) follows. (A different proof of the maximum occurring on the boundary can be found in [14].) □
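The key identity in the proof is easy to check numerically. The snippet below is only an illustrative sanity check on random data (the function name and the random test matrices are assumptions, not part of the paper): it evaluates the convexity gap directly and through the closed form $-\alpha(1-\alpha)\|U(\theta_1-\theta_2)\|_2^2$.

```python
import numpy as np

def convexity_gap(U, theta1, theta2, a):
    """Return (gap, closed_form) for f(theta) = ||U theta||_2^2, where
    gap = f(a*t1 + (1-a)*t2) - a*f(t1) - (1-a)*f(t2)."""
    f = lambda th: float(np.sum((U @ th) ** 2))
    gap = f(a * theta1 + (1 - a) * theta2) - a * f(theta1) - (1 - a) * f(theta2)
    closed_form = -a * (1 - a) * f(theta1 - theta2)
    return gap, closed_form

rng = np.random.default_rng(0)
U = rng.standard_normal((5, 3))
t1, t2 = rng.standard_normal(3), rng.standard_normal(3)
results = [convexity_gap(U, t1, t2, a) for a in np.linspace(0.0, 1.0, 11)]
```

Every gap is non-positive and matches the closed form up to rounding, which is exactly the convexity inequality used to push the maximizer to the boundary.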

Thus, the maximizer of (9) is given by

$$\theta^* = \arg\max_{\theta \in \Theta_b} \|U\theta\|_2^2 \tag{12}$$


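Theorem 1 The functional

$$J(u) = J_1(u) + J_2(u) \tag{13}$$

is convex in $u$.


Proof of Theorem 1 We express $y$ as

$$y = B_1(\theta) u + B_2(\theta) \phi_0 \tag{14}$$

where

$$B_1(\theta) = \begin{bmatrix} 0 & 0 & 0 & \cdots & \cdots & 0 & 0 \\ b_1 & 0 & 0 & \cdots & \cdots & 0 & 0 \\ b_2 & b_1 & 0 & \cdots & \cdots & 0 & 0 \\ \vdots & & \ddots & & & & \vdots \\ b_m & \cdots & b_1 & \ddots & & 0 & 0 \\ 0 & \ddots & \ddots & \ddots & \ddots & 0 & 0 \\ \vdots & \ddots & b_m & b_{m-1} & \cdots & b_1 & 0 \end{bmatrix},$$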


u 


Figure  1:  J  and  J2  as  functions  of  u. 

following simple example. Consider the case where $m = 2$ and $N = 1$, so $\theta = [\,b_1\ \ b_2\,]^T$ and $u = [\,u(0)\ \ u(1)\,]^T$. Since $y(1)$ does not depend on $u(1)$, we have $u(1) = 0$ and can consider $u = u(0)$. Let $\Theta$ be the set of points which lie on the line segment from $\theta_1 = [\,0.5\ \ {-1}\,]^T$ to $\theta_2 = [\,1\ \ 1\,]^T$, and $\phi_0 = [\,{-1}\ \ 1\,]^T$. As shown in Corollary 1, for a given $u$, the maximum of $y^T y$ must be at one of the endpoints of $\Theta$,

$$J_2(u) = \max\left( y^T y\,|_{\theta = \theta_1},\ y^T y\,|_{\theta = \theta_2} \right)$$

Figure 1 shows that for this example, there are two points where $J_2(u)$ is not differentiable. Also shown in Figure 1 is $J(u)$ with its minimum at $u^* = -0.4$.
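This small example can be reproduced numerically. The sketch below assumes a control weight $\rho = 1$ (an assumption: the weight used for Figure 1 is not stated in the surviving text, but this value is consistent with the reported minimizer) and uses $y(0) = b_1 u(-1) + b_2 u(-2)$ and $y(1) = b_1 u(0) + b_2 u(-1)$ from (1). All names are illustrative.

```python
import numpy as np

RHO = 1.0            # assumed control weight (not stated in the text)
PHI0 = (-1.0, 1.0)   # [u(-1), u(-2)]
THETA_1 = (0.5, -1.0)
THETA_2 = (1.0, 1.0)

def yTy(u0, theta):
    """Output energy y(0)^2 + y(1)^2 of the 2-tap plant for scalar control u(0)."""
    b1, b2 = theta
    y0 = b1 * PHI0[0] + b2 * PHI0[1]
    y1 = b1 * u0 + b2 * PHI0[0]
    return y0 ** 2 + y1 ** 2

def J2(u0):
    # By Corollary 1 the worst case over the segment is at an endpoint.
    return max(yTy(u0, THETA_1), yTy(u0, THETA_2))

def J(u0):
    return RHO * u0 ** 2 + J2(u0)

grid = np.linspace(-2.0, 6.0, 8001)            # step 0.001
u_star = grid[np.argmin([J(u) for u in grid])]
# The two branches of J2 cross (J2 has a kink) where u^2 - 4u - 3 = 0.
kinks = (2.0 - np.sqrt(7.0), 2.0 + np.sqrt(7.0))
```

Under this assumption the grid search lands at $u \approx -0.4$, and the two kinks of $J_2$ sit at $u = 2 \pm \sqrt{7}$, in agreement with the two non-differentiable points mentioned above.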


and

$$\phi_0 = [\,u(-1)\ \ u(-2)\ \cdots\ u(-m)\,]^T$$

Then

$$y^T y = \phi_0^T B_2^T B_2 \phi_0 + 2\phi_0^T B_2^T B_1 u + u^T B_1^T B_1 u \tag{15}$$

The first term on the right-hand side of (15) is constant in $u$, the second term is linear in $u$, and the third term $u^T B_1^T B_1 u = \|B_1 u\|_2^2$ is convex in $u$ by Corollary 1. Thus, $y^T y$ is convex in $u$ for each $\theta \in \Theta$. Since the maximum of a set of convex functionals is also convex [3, page 131], $J_2(u)$ is convex. By Corollary 1, $J_1(u) = \rho\|u\|_2^2$ is convex also. Since the sum of convex functionals is convex [3, page 131], $J(u)$ is convex in $u$. □


Since $J_2(u)$ is not differentiable for all $u$, we choose not to use the usual descent methods to find $u^*$. Instead, we will show that we can easily compute a subgradient of $J(u)$ and apply the ellipsoid algorithm described in [3, pages 324-332].

We first give the definition of a subgradient. If $J : \mathbb{R}^{N+1} \to \mathbb{R}$ is convex, but not necessarily differentiable, then $g \in \mathbb{R}^{N+1}$ is a subgradient of $J$ at $u_0$ if

$$J(u) \ge J(u_0) + g^T (u - u_0) \quad \text{for all } u$$

The set of all subgradients of $J$ at $u_0$ is denoted by $\partial J(u_0)$, the subdifferential of $J$ at $u_0$. The following two facts from [3, page 300] will be used.


With Theorem 1, we are guaranteed that there is a global minimum solution for $u^*$ and many efficient methods are available to compute it. However, we want to point out that although $J_2(u)$ is convex in $u$, it is not differentiable for all $u$. We will illustrate this point with the


1. Since $J_1(u)$ and $J_2(u)$ are convex, any vector of the form $g = g_1 + g_2$ is in $\partial J(u)$, where $g_1 \in \partial J_1(u)$ and $g_2 \in \partial J_2(u)$.

2. Let $y^T y$ from (15) evaluated at $\theta^*$ from (12) be denoted by

$$J_2(u, \theta^*) = \phi_0^T B_2^T(\theta^*) B_2(\theta^*) \phi_0 + 2\phi_0^T B_2^T(\theta^*) B_1(\theta^*) u + u^T B_1^T(\theta^*) B_1(\theta^*) u \tag{16}$$


initial states of the plant $\phi_0$ are assumed to be in a set similar to (2),

$$\phi_0 \in \Phi = \{\phi_0 : (\phi_0 - \phi_c)^T \Gamma_\phi (\phi_0 - \phi_c) \le 1\} \tag{18}$$


Since $y^T y$ is convex in $u$ for each $\theta \in \Theta_b$, $g_2 \in \partial J_2(u, \theta^*)$ implies $g_2 \in \partial J_2(u)$. In the event that there is more than one maximizer, we only need to pick one.

Thus, from (8) and (16) the subgradient of $J$ at $u$ is given by

$$g = 2\rho u + \left[ 2B_1^T(\theta^*) B_2(\theta^*) \phi_0 + 2B_1^T(\theta^*) B_1(\theta^*) u \right] \tag{17}$$


The problem posed in (6) now becomes

$$u^* = \arg\min_u \max_{\phi_0 \in \Phi} J_0 = \arg\min_u \left[ \rho u^T u + \max_{\phi_0 \in \Phi} y^T y \right] \tag{19}$$

Note that $y^T y$ from (15) is convex in $\phi_0$ for a given $u$. This means that the maximum of $y^T y$ lies on the boundary of $\Phi$, $\Phi_b$. Furthermore, using the same arguments from the proof of Theorem 1,


The computation of $\theta^*$ is not difficult, but the derivation is rather long. To avoid breaking the flow of this section, the method of finding $\theta^*$ is given in the Appendix. The ellipsoid algorithm for computing $u^* \in \mathbb{R}^K$ is as follows:

1. Select any $u_1$ and $E_1$ such that $u^*$ is in the initial ellipsoid,

$$u^* \in \{u : (u - u_1)^T E_1^{-1} (u - u_1) \le 1\}$$

2. $k \leftarrow 0$;

3. $k \leftarrow k + 1$;

4. Compute any $g_k \in \partial J(u_k)$:


$$J_\phi(u) = \rho u^T u + \max_{\phi_0 \in \Phi_b} y^T y$$

can be shown to be convex in $u$. Therefore, all we need to show is that we can compute a subgradient of $J_\phi(u)$,

$$g_\phi = 2\rho u + 2B_1^T B_1 u + 2B_1^T B_2 \phi_0^* \tag{20}$$

where

$$\phi_0^* = \arg\max_{\phi_0 \in \Phi_b} y^T y$$

From (14), we have

$$\phi_0^* = \arg\max_{\phi_0 \in \Phi_b} \|B_2 \phi_0 + B_1 u\|_2^2$$

This is similar to the form of (12) except that we have the extra term $B_1 u$. Thus, if we solve for $\theta^*$ with


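(a) Compute $z^*$ from Theorem 2;

(b) Compute $\theta^*$ from (31);

(c) Compute $g_k$ from (17);

5. Compute new ellipsoid:

$$\tilde{g} = \frac{g_k}{\sqrt{g_k^T E_k g_k}}$$

$$u_{k+1} = u_k - \frac{E_k \tilde{g}}{K + 1}$$

$$E_{k+1} = \frac{K^2}{K^2 - 1} \left( E_k - \frac{2}{K + 1} E_k \tilde{g} \tilde{g}^T E_k \right)$$

6. If $\sqrt{g_k^T E_k g_k} > \epsilon$, go to step 3.


The stopping criterion in step 6 guarantees that on exit, $J(u_k)$ is within $\epsilon$ of $J(u^*)$.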
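The ellipsoid iteration itself is short enough to sketch in code. The version below is a generic illustration under stated assumptions, not the authors' implementation: the caller supplies a subgradient oracle `subgrad` (in the paper this would come from steps 4(a) to 4(c)), the dimension `K` must be at least 2 because the update divides by $K^2 - 1$, and `max_iter` is an added safeguard. The stopping test is applied to the current center before it is updated, which is what the $\epsilon$-guarantee refers to.

```python
import numpy as np

def ellipsoid_minimize(subgrad, u1, E1, eps=1e-3, max_iter=2000):
    """Minimize a convex function given a subgradient oracle.

    The minimizer must lie in {u : (u - u1)^T E1^{-1} (u - u1) <= 1}.
    Returns (u, iterations, converged). Requires len(u1) >= 2.
    """
    u = np.asarray(u1, dtype=float).copy()
    E = np.asarray(E1, dtype=float).copy()
    K = u.size
    for k in range(1, max_iter + 1):
        g = subgrad(u)
        width = np.sqrt(g @ E @ g)       # bound on J(u) - J(u*) over the ellipsoid
        if width <= eps:                 # also covers g = 0 (u is optimal)
            return u, k, True
        g_t = g / width                  # normalized subgradient
        Eg = E @ g_t
        u = u - Eg / (K + 1)             # new center
        E = (K ** 2 / (K ** 2 - 1.0)) * (E - (2.0 / (K + 1)) * np.outer(Eg, Eg))
    return u, max_iter, False

# Toy non-differentiable objective (illustrative only), minimized at (1, -2)
J_toy = lambda u: abs(u[0] - 1.0) + 2.0 * abs(u[1] + 2.0)
g_toy = lambda u: np.array([np.sign(u[0] - 1.0), 2.0 * np.sign(u[1] + 2.0)])
u_min, iters, ok = ellipsoid_minimize(g_toy, np.zeros(2), 100.0 * np.eye(2))
```

The toy objective has kinks along both coordinate lines through its minimizer, so a plain gradient method would be awkward, while the ellipsoid method only needs one subgradient per iteration.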


$$q = -(B_2 \phi_c + B_1 u)$$

in (29) and replace $\Gamma$ and $\theta_c$ of (2) with $\Gamma_\phi$ and $\phi_c$ from (18), we have

$$\phi_0^* = \theta^*$$

Therefore, $u^*$ in (19) can be computed by the same ellipsoid algorithm given in Section 3, where the subgradient is now computed using (20).


5  Numerical  Example 

For our example, we use a 10-tap FIR plant, $m = 10$. The control vector $u$ has $N = 10$, so if $u = 0$, the output will be zero after 10 delays, $y(10) = 0$. The parameter ellipsoid $\Theta$ in (2) is a 10-dimensional ball with a radius of 5 and center at $\theta_c$. $\theta_c$, plotted in Figure 2 with the '+' symbol, is the first ten terms of the impulse response from the transfer function


4  Uncertain  Initial  States 


$$\frac{10z\,(z + 0.7\cos(\pi/4))}{z^2 - 2(0.7)\cos(\pi/4)\,z + 0.7^2}$$

The initial state of the plant,

$$x(0) = [\,u(-1)\ \ u(-2)\ \cdots\ u(-10)\,]^T$$

is scaled such that $\|x(0)\|_2 = 1$.

Figure 2: Plant parameters: o - $\theta_1$, x - $\theta_2$, * - $\theta_3$, and + - $\theta_c$.

In this section, we will consider the dual problem in which the parameter vector $\theta$ of the plant is known, but the

Using $\rho = 1$, we will compare the cost $J$ in (13) associated with three controls, $u_1 = 0$, $u_2 = u_{LQR}$, and $u_3 = u^*$, where $u_{LQR}$ is given by (7) with $\theta = \theta_c$. The controls $u_{LQR}$ and $u^*$ are plotted in Figure 3, where $\|u_{LQR}\|_2 = 2.63$ and $\|u^*\|_2 = 1.58$. We now define three plants from $\Theta$,

$$\theta_i = \arg\max_{\theta \in \Theta} \left( J_0\,|_{u = u_i} \right), \quad i = 1, 2, 3$$

They are the worst-case plants for their associated controls and are plotted in Figure 2. Table 1 shows the cost matrix, $C$, for the different plants and controls. We make the following observations from $C$:

1. For $i = 1, 2, 3$, $C(i, i)$ is the largest in each row, as the $\theta_i$'s are chosen that way.

2. $u_{LQR}$ has the lowest cost for $\theta_c$, 403, but this is only about 8% lower than the cost of $u^*$ (437).

3. $u^*$ has the lowest maximum cost, 697; the maximum costs from $u_{LQR}$ (1031) and from $u = 0$ (1306) are 48% and 87% higher, respectively. Thus, the robust design performed as expected.
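These observations can be checked directly from the entries of Table 1. The snippet below is a small illustrative calculation (the function and key names are hypothetical); rows are the controls $u_1 = 0$, $u_2 = u_{LQR}$, $u_3 = u^*$ and columns are the plants $\theta_1$, $\theta_2$, $\theta_3$, $\theta_c$.

```python
import numpy as np

# Cost matrix from Table 1: rows u1 = 0, u2 = u_LQR, u3 = u*;
# columns theta_1, theta_2, theta_3, theta_c.
C = np.array([
    [1306.0,  902.0, 1136.0, 837.0],
    [ 587.0, 1031.0,  785.0, 403.0],
    [ 655.0,  632.0,  697.0, 437.0],
])

def summarize(C):
    """Worst-case and nominal comparisons between the three controls."""
    worst = C[:, :3].max(axis=1)                 # worst cost over theta_1..theta_3
    return {
        "worst_plant_index": np.argmax(C[:, :3], axis=1),   # expected 0, 1, 2
        "lqr_worst_vs_robust_pct": 100.0 * (worst[1] / worst[2] - 1.0),
        "zero_worst_vs_robust_pct": 100.0 * (worst[0] / worst[2] - 1.0),
        "lqr_nominal_saving_pct": 100.0 * (1.0 - C[1, 3] / C[2, 3]),
    }

summary = summarize(C)
```

The percentages are measured relative to the robust control's costs, which is the reading under which the table reproduces the 48%, 87% and 8% figures.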


6  Concluding  Remarks 

We have shown in this paper that given that the FIR plant parameters are known to lie in an ellipsoid, finding


Figure  3:  Controls:  x  -  ulqr  and  *  -  u*. 


                  θ_1     θ_2     θ_3     θ_c
u_1 = 0           1306    902     1136    837
u_2 = u_LQR       587     1031    785     403
u_3 = u*          655     632     697     437

Table 1: Cost matrix for different u's and θ's.


the finite-horizon control to minimize the maximum LQR cost from all plants with parameters in the given set is a convex optimization problem. An algorithm is given to compute this minimizing control. Although the algorithm can also compute the minimizing control when the plant parameters are known but the initial states of the plant are in an ellipsoid, it would be desirable to minimize the maximum over both parameter and initial state uncertainties simultaneously. Furthermore, we would like to extend our method to the infinite-horizon case for infinite impulse response (IIR) plants. These are areas of our current research.


7  Appendix 

Given the following matrices,

$$U \in \mathbb{R}^{(N+1) \times m} \tag{21}$$

$$\Gamma \in \mathbb{R}^{m \times m}, \quad \Gamma = \Gamma^T > 0 \tag{22}$$

$$\theta_c \in \mathbb{R}^m \tag{23}$$

we want to find the maximizer $\theta^*$ in (12). This is similar to the least squares problem with quadratic and linear constraints, which was investigated in [8] and [9]. However, we are seeking a maximizer as compared to a minimizer.
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Since  P  is  symmetric,  we  can  diagonalize  it  by  a  unitary  matrix, 

$$\Gamma = T \Lambda T^T$$


Theorem 2 Let $\lambda^*$ be the largest eigenvalue of $M$, then there are two possible cases for the maximizer of (30):


where $\Lambda$ is diagonal with eigenvalues of $\Gamma$ and the columns of $T$ are eigenvectors of $\Gamma$. We now transform $\Theta_b$ in (11) to the unit ball,


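$$B = \{z : z^T z = 1\} \tag{24}$$

where

$$z = \Lambda^{1/2} T^T (\theta - \theta_c) \tag{25}$$

Substituting

$$\theta = T \Lambda^{-1/2} z + \theta_c \tag{26}$$

into (12), we have

$$z^* = \arg\max_{z^T z = 1} \|Dz - q\|_2^2 \tag{27}$$

where

$$D = U T \Lambda^{-1/2} \tag{28}$$

$$q = -U\theta_c \tag{29}$$

Define

$$\Omega = D^T D$$

$$\beta = D^T q$$

then

$$z^* = \arg\max_{z^T z = 1} \left( z^T \Omega z - 2\beta^T z \right) \tag{30}$$

Substituting $z^*$ into (26), $\theta^*$ in (12) is given by

$$\theta^* = T \Lambda^{-1/2} z^* + \theta_c \tag{31}$$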

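To find $z^*$ in (30), we introduce the Lagrange multiplier $\lambda$ and adjoin the constraint, $z^T z = 1$,

$$L = z^T \Omega z - 2\beta^T z + \lambda\,(1 - z^T z)$$

Necessary conditions for the stationary points are

$$\frac{\partial L}{\partial z} = 2\Omega z - 2\beta - 2\lambda z = 0$$

$$\frac{\partial L}{\partial \lambda} = 1 - z^T z = 0$$

or

$$\Omega z = \lambda z + \beta \tag{32}$$

$$z^T z = 1 \tag{33}$$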


The problem of finding all the stationary points of such a second-degree polynomial on the unit sphere was first investigated in [6], but the computation of the solution was not considered there. A proof similar to the one given in [8], however, can be used to show the following:


Corollary 2 If $(z_1, \lambda_1)$ and $(z_2, \lambda_2)$ satisfy (32) and (33) and $\lambda_1 > \lambda_2$, then

$$z_1^T \Omega z_1 - 2\beta^T z_1 > z_2^T \Omega z_2 - 2\beta^T z_2 \tag{34}$$


Thus, in place of the maximization problem in (30), we need to solve the Lagrange equations (32) and (33) with

$$\lambda = \text{maximum} \tag{35}$$


In [9], it was shown that (32) and (33) can be transformed to a quadratic eigenvalue problem,

$$(\Omega - \lambda I)^2 \eta = \beta \beta^T \eta$$

Furthermore, the quadratic eigenvalue problem can be reduced to an ordinary eigenvalue problem by finding the eigenvalues of

$$M = \begin{bmatrix} \Omega & -I \\ -\beta\beta^T & \Omega \end{bmatrix}$$

The solution of (30) is summarized in the following theorem:


1. If $\lambda^*$ is not an eigenvalue of $\Omega$, then $z^* = (\Omega - \lambda^* I)^{-1} \beta$.

2. If $\lambda^*$ is an eigenvalue of $\Omega$, then let $v = (\Omega - \lambda^* I)^{\dagger} \beta$, where $\dagger$ denotes the pseudoinverse, and

(a) If $z = v$ satisfies (32) and (33), then $z^* = v$.

(b) If $z = v$ satisfies (32) and $v^T v < 1$, then $z^* = v + \xi$ is one of many solutions, where $\xi$ is an eigenvector to the eigenvalue $\lambda^*$ of $\Omega$ with $\xi^T \xi = 1 - v^T v$.


Proof of Theorem 2 In [9], the minimization of (30) was analyzed. Due to Corollary 2, we can apply all the results from [9] by replacing the smallest eigenvalue of $M$ with the largest eigenvalue. □
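Case 1 of Theorem 2 translates into a few lines of NumPy. The sketch below is illustrative only and rests on stated assumptions: it handles just the generic case in which $\lambda^*$ is not an eigenvalue of $\Omega$, it selects $\lambda^*$ as the largest real eigenvalue of $M$ (complex eigenvalues are discarded with a tolerance `tol` that is my own choice), and the function name is hypothetical.

```python
import numpy as np

def max_quadratic_on_sphere(Omega, beta, tol=1e-8):
    """Maximize z^T Omega z - 2 beta^T z subject to z^T z = 1 (generic case only).

    Returns (z_star, lam_star), where lam_star is the largest real eigenvalue of
    M = [[Omega, -I], [-beta beta^T, Omega]] and z_star solves
    (Omega - lam_star I) z = beta, i.e. the stationarity condition (32).
    """
    Omega = np.asarray(Omega, dtype=float)
    beta = np.asarray(beta, dtype=float)
    n = beta.size
    I = np.eye(n)
    M = np.block([[Omega, -I], [-np.outer(beta, beta), Omega]])
    eigvals = np.linalg.eigvals(M)
    real_eigs = eigvals[np.abs(eigvals.imag) < tol].real
    lam_star = real_eigs.max()
    z_star = np.linalg.solve(Omega - lam_star * I, beta)
    return z_star, lam_star

Omega = np.diag([2.0, 1.0])
beta = np.array([0.5, 0.5])
z_star, lam_star = max_quadratic_on_sphere(Omega, beta)
```

Once `z_star` is available, $\theta^*$ follows from (31). The degenerate case 2, where $\lambda^*$ coincides with an eigenvalue of $\Omega$, would need the pseudoinverse branch and is deliberately left out of this sketch.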
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Abstract  Precise,  finite-data  statistical  propereties  are  de¬ 
termined  using  a  least-squares  estimator  based  on  an  output 
error  model  with  an  affine  parameter  representation  where  the 
true  system  is  of  output  error  form,  but  is  not  in  the  model  set. 
The purpose of the analysis is to show the effect of unmodeled dynamics on the resulting closed-loop system designed on the basis of the estimated transfer function. This simple problem set-up is prototypical of the interplay between system identification and robust control design.

Introduction 

The problem addressed is the following: given a finite collection of sensed sampled input/output data from an unknown system, what level of confidence can be assigned to a feedback controller design or modification?

To  make  the  problem  both  representative  and  analytically 
tractable,  the  following  a  priori  qualitative  data  is  assumed: 

(a1) The system which is generating the data is a discrete linear-time-invariant system in output error form, i.e.,

$$y_t = (Gu)_t + e_t \tag{1}$$

where $t$ is the sampling time, $u$ and $y$ are the sensed input and output sequences, respectively, and $e$ is an unpredictable output disturbance. The operator $G$ is linear-time-invariant with unknown transfer function $G(z)$ and corresponding impulse response sequence $g$. Thus,

$$(Gu)_t = \sum_{k=1}^{\infty} g_k u_{t-k} \tag{2}$$

(a2) $G(z)$ is stable, i.e., all the poles of $G(z)$ are strictly inside the unit circle. Hence, there exist positive constants $M > 1$ and $\rho < 1$ such that

$$|g_k| \le M \rho^{k-1}, \quad \forall k \ge 1 \tag{3}$$

(a3) The unpredictable sequence $e$ is zero-mean gaussian i.i.d. with unknown variance $\lambda_e$.

(a4) The input sequence $u$ is deterministic, hence, independent of $e$.

It is important to emphasize that none of the parameters that appear in the above assumptions are assumed to be known; they are only known to exist. Hence, there is no quantitative a priori knowledge about $M$, $\rho$, or $\lambda_e$.

The above qualitative assumptions do, however, impose varying degrees of restrictiveness. Assumption (a1) imposes an LTI structure, which by itself is not necessarily restrictive; however, the output error form is very specific. This latter restriction, together with the gaussian assumption (a3), makes the statistical analysis easier without resorting to a central limit theorem or a law of large numbers. Assumption (a4) implies that the system is operating in open-loop, for otherwise $u$ would have a component which is correlated with $e$.

For control design it is desirable to obtain an estimate of $G(z)$. It is standard practice to form a parametric model $G(z, \theta)$ and estimate the free parameter $\theta$. Although many parametric forms are possible, e.g., [4], for ease of analysis we choose the following affine FIR parametrization:

$$G(z, \theta) = \sum_{k=1}^{n} \theta_k z^{-k} \tag{4}$$

Thus, the problem is to estimate the first $n$ impulse response coefficients $\{g_1, \ldots, g_n\}$. Although we specialize to the FIR modeling case, all the results apply mutatis mutandis to any other affine model of $G(z)$, e.g., Laguerre or Kautz models as described in [5]. The essence of the problem addressed here is, in our opinion, the motivation for the work described in the recent special issue [6] on system identification for robust control design. In comparison with [2], the smoothness parameters $M$, $\rho$ are not estimated by modeling the tail of the impulse response $\{g_{n+1}, g_{n+2}, \ldots\}$ as a random variable. Our attempt here is to precisely determine the effect of the unmodeled dynamics, i.e., the tail of the impulse response, on a least-squares parameter estimator, without any further prior assumptions.


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Least-Squares  Estimation 

In this section we use least-squares on the measured data to estimate the first $n$ impulse response coefficients of $\{g_1, g_2, \ldots\}$ in (2). Towards this end, the unknown impulse response parameters $\{g_1, g_2, \ldots\}$ are partitioned into the (finite) parameter vector to be estimated,


When $\beta = 0$, it is well known that $\hat{\alpha}$ and $\hat{\lambda}_e$ are the maximum likelihood estimates of $\alpha$ and $\lambda_e$, respectively, e.g., [1]. In our case, $\beta \ne 0$, and its effect on the estimates is the subject of the next section.

Statistical  Analysis 


$$\alpha = [\,g_1\ \cdots\ g_n\,]^T \in \mathbb{R}^n \tag{5}$$


which consists of the first $n$ impulse response coefficients, and the (infinite) parameter vector

$$\beta = [\,g_{n+1}\ \ g_{n+2}\ \cdots\,]^T \in \mathbb{R}^\infty \tag{6}$$

which is the remainder of the impulse response. These parameters - the "tail" of the impulse response, $\{g_{n+1}, g_{n+2}, \ldots\}$ - can significantly bias the estimate of the "head," namely, $\{g_1, \ldots, g_n\}$. Statisticians refer to $\beta$ as a "nuisance" parameter. Note that because $G$ is stable, $\|\beta\|$ is not only finite, but decreases exponentially as $n$ increases. That is, using (3),

$$\|\beta\|^2 = \sum_{k=n+1}^{\infty} g_k^2 \le \frac{M^2 \rho^{2n}}{1 - \rho^2} \tag{7}$$


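Using the definition of $\alpha$ and $\beta$ together with (1) gives,

$$Y = X\alpha + \tilde{X}\beta + E \tag{8}$$

where

$$Y = [\,y_1\ \cdots\ y_N\,]^T \in \mathbb{R}^N, \qquad E = [\,e_1\ \cdots\ e_N\,]^T \in \mathbb{R}^N \tag{9}$$

$$X = \begin{bmatrix} u_0 & \cdots & u_{1-n} \\ \vdots & & \vdots \\ u_{N-1} & \cdots & u_{N-n} \end{bmatrix} \in \mathbb{R}^{N \times n} \tag{10}$$

$$\tilde{X} = \begin{bmatrix} u_{-n} & u_{-n-1} & \cdots \\ \vdots & \vdots & \\ u_{N-n-1} & u_{N-n-2} & \cdots \end{bmatrix} \in \mathbb{R}^{N \times \infty} \tag{11}$$

Assuming that $X'X \in \mathbb{R}^{n \times n}$ is non-singular, i.e., $u$ is persistently exciting of order $n$, the least-squares estimate of $\alpha$ is given by the well known formula:

$$\hat{\alpha} = [\,\hat{g}_1\ \cdots\ \hat{g}_n\,]^T = \arg\min_{\theta \in \mathbb{R}^n} \|Y - X\theta\|^2 = (X'X)^{-1} X'Y \tag{12}$$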
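A minimal sketch of the estimator in (12) follows. The helper names and the storage convention for the input history are assumptions made for illustration: the input samples $u_{1-n}, \ldots, u_{N-1}$ are held in one array, and row $t$ of the regressor is $[u_{t-1}, \ldots, u_{t-n}]$. `numpy.linalg.lstsq` is used in place of forming $(X'X)^{-1}$ explicitly; the two agree when $X'X$ is non-singular.

```python
import numpy as np

def regressor(u_hist, N, n):
    """Build X in R^{N x n} with row t = [u_{t-1}, ..., u_{t-n}], t = 1..N.

    u_hist holds u_{1-n}, ..., u_{N-1} (length N + n - 1).
    """
    off = n - 1                      # u_hist[off + s] is u_s
    return np.array([[u_hist[off + t - k] for k in range(1, n + 1)]
                     for t in range(1, N + 1)])

def ls_estimate(X, Y):
    """Least-squares estimate of the first n impulse response coefficients."""
    alpha_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return alpha_hat

# Noise-free check with no tail (beta = 0): the estimate recovers alpha.
rng = np.random.default_rng(0)
N, n = 50, 2
u_hist = rng.standard_normal(N + n - 1)
X = regressor(u_hist, N, n)
alpha_true = np.array([1.0, 0.5])
alpha_hat = ls_estimate(X, X @ alpha_true)
```

With a non-zero tail the same call returns a biased estimate, which is the effect quantified in the statistical analysis.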


In this section we analyze the effect of the nuisance parameter $\beta$ on the estimates $\hat{\alpha}$ and $\hat{\lambda}_e$ of $\alpha$ and $\lambda_e$, respectively. We use the standard notation $\mathcal{N}(\mu, \Sigma)$ to denote a gaussian distribution with mean $\mu$ and variance $\Sigma$. Likewise, $\chi^2(m)$ denotes a chi-squared distribution with $m$ degrees of freedom. Recall that if $q \in \mathbb{R}^m$ is drawn from $\mathcal{N}(0, R)$ with $R$ non-singular, then $q'R^{-1}q \in \chi^2(m)$. We also use $\chi^2(m, r)$ to denote a non-central chi-squared distribution with $m$ degrees of freedom and non-centrality parameter $r$. To fix the definition of the non-centrality parameter, if $q \in \mathbb{R}^m$ is drawn from $\mathcal{N}(\mu, R)$, then $q'R^{-1}q \in \chi^2(m, r)$ with $r = \mu'R^{-1}\mu$. From [3], we also use: as either¹ $m$ or $r \to \infty$, $\chi^2(m, r) \to \mathcal{N}(m + r,\ 2(m + 2r))$. Hence, $\chi^2(m, 0) = \chi^2(m)$ and as $m \to \infty$, $\chi^2(m) \to \mathcal{N}(m, 2m)$.

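It is convenient to define the "covariance" matrices,²

$$\Sigma_{11} = \frac{1}{N} X'X \in \mathbb{R}^{n \times n} \tag{14}$$

$$\Sigma_{12} = \frac{1}{N} X'\tilde{X} \in \mathbb{R}^{n \times \infty} \tag{15}$$

$$\Sigma_{22} = \frac{1}{N} \tilde{X}'\tilde{X} \in \mathbb{R}^{\infty \times \infty} \tag{16}$$

Observe that only $\Sigma_{11}$ can be formed from the data and by assumption is invertible.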

The  following  theorem  describes  the  distributions  of  the  key 
random  variables. 


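Theorem 1 Define the parameter error,

$$\tilde{\alpha} = \hat{\alpha} - \alpha \tag{17}$$

and the output error,

$$\hat{E} = Y - X\hat{\alpha} \tag{18}$$

Under assumptions (a1)-(a4),

(i) The parameter error $\tilde{\alpha}$ and the residual $\hat{E}$ are independent and normally distributed as follows:

$$\tilde{\alpha} \in \mathcal{N}\!\left( \Sigma_{11}^{-1} \Sigma_{12} \beta,\ \frac{\lambda_e}{N} \Sigma_{11}^{-1} \right) \tag{19}$$

$$\hat{E} \in \mathcal{N}\!\left( \Gamma \tilde{X} \beta,\ \lambda_e \Gamma \right) \tag{20}$$

where $\Gamma \in \mathbb{R}^{N \times N}$, given by,

$$\Gamma = I_N - X (X'X)^{-1} X' \tag{21}$$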

where $\{\hat{g}_k \mid k = 1 : n\}$ can be thought of as estimates of $\{g_k \mid k = 1 : n\}$. We also take the estimate of $\lambda_e$, the output error variance, as the sample variance,

$$\hat{\lambda}_e = \frac{1}{N} \|Y - X\hat{\alpha}\|^2 \tag{13}$$


has rank $N - n$ and is idempotent, i.e., $\Gamma = \Gamma^2$.

¹ It can be shown that this result is also true if both $m$ and $r \to \infty$.
² Although the matrices $\Sigma_{12}$, $\Sigma_{22}$ are infinite dimensional, they always appear multiplying $\beta$. Hence, these terms are bounded because the elements in $\beta$ decay exponentially.
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(it)  and  ^-ft'Eua  have  the  following  non-central  chi- 

squared  distributions: 


A'AC 

A„ 

e 

XJ| 

-  n, 

£•) 

\  (22) 

w~,v  ~ 

—  a  End 
A* 

€ 

x2  1 

(*• 

N 

Ae7 

) 

(23) 

where 

3 

=  tf'EbEn 

E|2 

0 

(24) 

6 

=  0  - 

7  = 

■  0’  (E22 

-  E 

12  em 

lZ„)0  (25) 

A*  A  — 

oo, 

A.  -  Af\ 

(Ae 

+ 

2A< 

N 

L(A« 

*f  26 

))  (26) 

S'E 

lid  — ♦  Af  \ 

/  nA 
U V 

-  + 

2Ae  , 

7-  ~FT  ’ 

<n\€ 

V  N 

+  27))  (27) 

The results in part (i) follow directly from the underlying assumptions and definitions of the variables, and except for the non-zero bias terms, are standard, e.g., [1]. Part (ii) is non-standard, in that the error statistics involve non-central chi-squared distributions. These results are obtained by direct appeal to the relation between a normally distributed random variable and the non-central chi-squared statistic as stated in the introduction to this section. The asymptotic results in part (iii) follow from the asymptotic normal approximation to a non-central chi-squared distribution as stated in the introduction to this section.


The asymptotic part of the above theorem leads to the following:


Approximation  2  For  sufficiently  large  N  ,  if  u  is  white,  i.e., 
(30)  holds,  then  with  high  probability: 


Ac  +  Au  PH2 

3n  A* 

~NK 


(34) 

(35) 


Large  N  and  High  Probability 


When the input is white, “large $N$” can be taken as,

$N \gg 2(1 + 2\eta), \quad \eta = \frac{\lambda_u \|\beta\|^2}{\lambda_e}$  (36)

where $\eta$ is the ratio of the energy in the tail to the output error energy. Typical values of $N$, e.g., 500-1000, will always be well in excess of variations caused by $\eta$. Moreover, from central and non-central chi-square tables (e.g., [3]), values of $N > 100$ and $n > 20$ make the normal approximations very accurate. In consequence, “high probability” is in excess of 99.95% for typical data lengths and model orders. Similar numbers hold for the general case with a non-white input.


Frequency  Response  Estimation 


In  part  (iii)  of  the  theorem,  the  asymptotic  variances  decay 
as  1/N.  Hence,  for  sufficiently  large  N,  the  random  variable 
approaches  the  mean  with  high  probability.  This  leads  directly 
to  the  following: 


The  results  of  the  previous  section  can  be  used  to  analyze 
the  errors  in  frequency  response  estimation.  Towards  this  end, 
express  G(z),  the  true  transfer  function  as, 

$G(z) = D(z)'\alpha + \tilde{D}(z)'\beta$  (37)


Approximation 1 For sufficiently large $N$, the following approximations hold with high probability,

$\hat{\lambda}_e \approx \lambda_e + \delta$  (28)

$\tilde{\alpha}'\Sigma_{11}\tilde{\alpha} \approx \frac{n}{N}\lambda_e + \gamma$  (29)

Observe that for large $N$, the variance estimate $\hat{\lambda}_e$ tends to over-estimate the true variance $\lambda_e$. In addition, the errors $\tilde{\alpha}$ and $\hat{\lambda}_e - \lambda_e$ are driven by the “nuisance” parameter $\beta$, i.e., the tail of the impulse response.
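
The over-estimation of the noise variance by the tail can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' experiment: the infinite tail is replaced by a finite one of length `L - n`, the impulse response $g_k = 0.8^k$ is a hypothetical choice, and the names `X_tail`, `S11`, `S12`, `S22` are illustrative stand-ins for $\tilde{X}$, $\Sigma_{11}$, $\Sigma_{12}$, $\Sigma_{22}$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, L = 2000, 10, 40            # data length, model order, total taps (assumed finite tail)
lam_u, lam_e = 1.0, 0.01          # input and output-error variances (assumed)
g = 0.8 ** np.arange(1, L + 1)    # hypothetical exponentially decaying impulse response
alpha, beta = g[:n], g[n:]        # first n coefficients and the "tail"

u = rng.normal(0.0, np.sqrt(lam_u), N + L)
# Row t of X_full holds [u(t-1), ..., u(t-L)].
X_full = np.column_stack([u[L - k : L - k + N] for k in range(1, L + 1)])
X, X_tail = X_full[:, :n], X_full[:, n:]
e = rng.normal(0.0, np.sqrt(lam_e), N)
Y = X @ alpha + X_tail @ beta + e

# Least-squares estimate of the first n coefficients and the sample variance (13).
alpha_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
lam_e_hat = np.sum((Y - X @ alpha_hat) ** 2) / N

# "Covariance" matrices (14)-(16) and the bias terms (24)-(25).
S11, S12, S22 = X.T @ X / N, X.T @ X_tail / N, X_tail.T @ X_tail / N
gamma = beta @ S12.T @ np.linalg.solve(S11, S12 @ beta)
delta = beta @ S22 @ beta - gamma

alpha_err = alpha_hat - alpha
print("lam_e_hat:", lam_e_hat, " lam_e + delta:", lam_e + delta)
print("alpha_err' S11 alpha_err:", alpha_err @ S11 @ alpha_err,
      " n*lam_e/N + gamma:", n * lam_e / N + gamma)
```

With these numbers the tail energy $\delta$ is roughly twice $\lambda_e$, so $\hat{\lambda}_e$ lands near $\lambda_e + \delta$ rather than near $\lambda_e$. The second comparison is much noisier in a single run because its relative standard deviation is large for small $n$.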

A special case of interest is when the input $u$ is white, i.e.,

$\Sigma_{11} = \lambda_u I_n, \quad \Sigma_{12} = 0, \quad \Sigma_{22} = \lambda_u I_\infty$  (30)


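Theorem 2 If $u$ is white, i.e., (30) holds, then:

$\frac{N\hat{\lambda}_e}{\lambda_e} \in \chi^2\left(N - n,\ \frac{N\lambda_u}{\lambda_e}\|\beta\|^2\right)$  (31)

$\frac{N\lambda_u}{\lambda_e}\|\tilde{\alpha}\|^2 \in \chi^2(n)$  (32)

In addition, as $N \to \infty$,

$\hat{\lambda}_e \to \mathcal{N}\left(\lambda_e + \lambda_u\|\beta\|^2,\ \frac{2\lambda_e}{N}\left(\lambda_e + 2\lambda_u\|\beta\|^2\right)\right)$  (33)

In (37), the vectors $D(z)$ and $\tilde{D}(z)$ are

$D(z) = \begin{bmatrix} z^{-1} \\ \vdots \\ z^{-n} \end{bmatrix}, \quad \tilde{D}(z) = \begin{bmatrix} z^{-(n+1)} \\ z^{-(n+2)} \\ \vdots \end{bmatrix}$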

Let $\hat{G}(z)$ denote the transfer function estimate of $G(z)$ defined as

$\hat{G}(z) = D(z)'\hat{\alpha}$  (39)

where $\hat{\alpha}$ is the least-squares parameter estimate from (12) of the first $n$ impulse response coefficients of $G(z)$. Let $\Delta(z)$ denote the transfer function error defined as,

$\Delta(z) = G(z) - \hat{G}(z)$  (40)

$\phantom{\Delta(z)} = -D(z)'\tilde{\alpha} + \tilde{D}(z)'\beta$  (41)

where

$D(z)'\tilde{\alpha} = \sum_{k=1}^{n} (\hat{g}_k - g_k) z^{-k}, \quad \tilde{D}(z)'\beta = \sum_{k=n+1}^{\infty} g_k z^{-k}$  (42)

with $\tilde{\alpha}$ the parameter error from (17).
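
The decomposition of the transfer function error into a parameter-error part and a tail part can be verified on a frequency grid. The sketch below assumes a finite, hypothetical impulse response and uses a perturbed copy of the true coefficients as a stand-in for a least-squares estimate; `fir_response` is an illustrative helper, not a library routine.

```python
import numpy as np

def fir_response(coeffs, first_lag, omega):
    """Evaluate sum_k coeffs[k] * exp(-j*omega*(first_lag + k)) on a frequency grid."""
    lags = first_lag + np.arange(len(coeffs))
    return np.exp(-1j * np.outer(omega, lags)) @ coeffs

g = 0.7 ** np.arange(1, 31)        # hypothetical true impulse response g_1 .. g_30
n = 8
alpha, beta = g[:n], g[n:]         # modelled coefficients and tail
alpha_hat = alpha + 0.01 * np.cos(np.arange(n))   # stand-in for a least-squares estimate
omega = np.linspace(0.0, np.pi, 256)

G = fir_response(g, 1, omega)
G_hat = fir_response(alpha_hat, 1, omega)

# Direct definition (40) versus the split form (41).
delta_direct = G - G_hat
tail_part = fir_response(beta, n + 1, omega)
delta_split = -fir_response(alpha_hat - alpha, 1, omega) + tail_part

print("max mismatch:", np.max(np.abs(delta_direct - delta_split)))
print("max |tail part|:", np.max(np.abs(tail_part)))
```

The tail part is bounded at every frequency by the sum of the absolute tail coefficients, which is the fact used later to bound it with an exponential envelope.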

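From Theorem 1 the following result is obtained.

Theorem 3 The following results hold at each frequency $\omega$:

(i) Normal distribution

$\Delta(e^{j\omega}) \in \mathcal{N}\left(F(e^{j\omega})'\beta,\ \frac{\lambda_e}{N} D(e^{j\omega})^* \Sigma_{11}^{-1} D(e^{j\omega})\right)$

$F(z)' = \tilde{D}(z)' - D(z)'\Sigma_{11}^{-1}\Sigma_{12}$

(ii) Non-central chi-squared distribution

$\frac{|\Delta(e^{j\omega})|^2}{\frac{\lambda_e}{N} D(e^{j\omega})^* \Sigma_{11}^{-1} D(e^{j\omega})} \in \chi^2(1, \epsilon(\omega))$  (45)

with non-centrality parameter,

$\epsilon(\omega) = \frac{|F(e^{j\omega})'\beta|^2}{\frac{\lambda_e}{N} D(e^{j\omega})^* \Sigma_{11}^{-1} D(e^{j\omega})}$

(iii) Asymptotic Normality

As $N \to \infty$,

$\frac{|\Delta(e^{j\omega})|^2}{\frac{\lambda_e}{N} D(e^{j\omega})^* \Sigma_{11}^{-1} D(e^{j\omega})} \to \mathcal{N}\left(1 + \epsilon(\omega),\ 2(1 + 2\epsilon(\omega))\right)$  (47)

Part (iii) leads to the following result.

Approximation 3 For sufficiently large $N$, the following approximation holds with high probability at each frequency $\omega$:

$|\Delta(e^{j\omega})|^2 \approx \frac{\lambda_e}{N} D(e^{j\omega})^* \Sigma_{11}^{-1} D(e^{j\omega}) + |F(e^{j\omega})'\beta|^2$  (48)

Observe that if $u$ is white (30) then

$D(e^{j\omega})^* \Sigma_{11}^{-1} D(e^{j\omega}) = D(e^{j\omega})^* \left(\frac{1}{\lambda_u} I_n\right) D(e^{j\omega}) = \frac{1}{\lambda_u} D(e^{j\omega})^* D(e^{j\omega}) = \frac{n}{\lambda_u}$

This leads to the following:

Theorem 4 If $u$ is white, i.e., (30) holds, then at each frequency $\omega$:

(i) Normal distribution

$\Delta(e^{j\omega}) \in \mathcal{N}\left(\tilde{D}(e^{j\omega})'\beta,\ \frac{n\lambda_e}{N\lambda_u}\right)$

(ii) Non-central chi-squared distribution

$\frac{|\Delta(e^{j\omega})|^2}{n\lambda_e / (N\lambda_u)} \in \chi^2(1, \epsilon(\omega))$

with non-centrality parameter

$\epsilon(\omega) = \frac{|\tilde{D}(e^{j\omega})'\beta|^2}{n\lambda_e / (N\lambda_u)}$

(iii) Asymptotic Normality

As $N \to \infty$,

$\frac{|\Delta(e^{j\omega})|^2}{n\lambda_e / (N\lambda_u)} \to \mathcal{N}\left(1 + \epsilon(\omega),\ 2(1 + 2\epsilon(\omega))\right)$  (52)

Part (iii) together with Approximation 2 leads to:

Approximation 4 If $u$ is white, i.e., (30) holds, then for sufficiently large $N$, the following approximation holds with high probability at each frequency $\omega$:

$|\Delta(e^{j\omega})|^2 \approx \frac{3n}{N}\frac{\lambda_e}{\lambda_u} + |\tilde{D}(e^{j\omega})'\beta|^2$


Robust Control Analysis

In this section, we use the asymptotic frequency domain bounds to evaluate controller robustness. The goal of control is to reduce the output variance. Consider the LTI feedback controller

$u = -Ky$  (54)

where $K$ stabilizes the “estimated” FIR system

$y = \hat{G}u + e, \quad \hat{G}(z) = \sum_{k=1}^{n} \hat{g}_k z^{-k}$

Applying the control (54) to the actual system (1) yields the closed-loop system

$y = \frac{T}{1 + Q\Delta}\, e, \quad u = -\frac{Q}{1 + Q\Delta}\, e$

$T = \frac{1}{1 + \hat{G}K}, \quad Q = \frac{K}{1 + \hat{G}K}$

with $\Delta$ the estimation error as defined in (40). Since the nominal system is stable, it follows that $\Delta$, $T$, and $Q$ are stable transfer functions. Hence, the closed-loop system is stable if and only if,

$|1 + Q(e^{j\omega})\Delta(e^{j\omega})| > 0, \quad \forall |\omega| \le \pi$  (58)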

If this holds, then the spectrum of $y$, under closed-loop (not during identification), is given by:

$\Phi_y(\omega) = \left|\frac{T(e^{j\omega})}{1 + Q(e^{j\omega})\Delta(e^{j\omega})}\right|^2 \lambda_e$  (59)


Suppose that $u$, during identification, is white, i.e., (30) holds. To establish stability, observe that a sufficient condition for stability is that,

$|Q(e^{j\omega})|\,|\Delta(e^{j\omega})| < 1, \quad \forall |\omega| \le \pi$  (60)


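Using the expression for $|\Delta(e^{j\omega})|$ in Approximation 4 and substituting for $\lambda_e$ from (34), it follows that for large $N$, the closed-loop system is stable, with high probability, if,

$|Q(e^{j\omega})|^2 \left[\frac{3n}{N}\left(\frac{\hat{\lambda}_e}{\lambda_u} - \|\beta\|^2\right) + |\tilde{D}(e^{j\omega})'\beta|^2\right] < 1, \quad \forall |\omega| \le \pi$  (61)

Hence, using the large $N$ approximations, with high probability, the output spectrum is bounded as follows:

$\Phi_y(\omega) \le \frac{|T(e^{j\omega})|^2 \left(\hat{\lambda}_e - \lambda_u\|\beta\|^2\right)}{\left(1 - |Q(e^{j\omega})| \left[\frac{3n}{N}\left(\frac{\hat{\lambda}_e}{\lambda_u} - \|\beta\|^2\right) + |\tilde{D}(e^{j\omega})'\beta|^2\right]^{1/2}\right)^2}$  (62)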

The only unknown quantity is $\beta$. From (34), we also know with high probability that,

$\lambda_e \approx \hat{\lambda}_e - \lambda_u\|\beta\|^2$

Since $\lambda_e$ must be positive, it follows that

$\|\beta\|^2 < \hat{\lambda}_e / \lambda_u$  (63)

provides a worst-case upper bound. Observe that this bound is known because $\hat{\lambda}_e$ is the computed variance estimate and $\lambda_u$ is selected by the user as the input variance. As a practical matter, it is unlikely that $\beta$ will achieve this bound. If it did, then the noise variance $\lambda_e \approx 0$, which for large $N$, will almost never occur.


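Using (3), we get

$|\tilde{D}(e^{j\omega})'\beta| = \left|\sum_{k=n+1}^{\infty} g_k e^{-j\omega k}\right| \le \frac{M\rho^n}{1 - \rho}$

Hence, for large $N$, the closed-loop system is stable with high probability if,

$|Q(e^{j\omega})|^2 \left[\frac{3n\hat{\lambda}_e}{N\lambda_u} + \frac{M^2\rho^{2n}}{(1 - \rho)^2}\right] < 1, \quad \forall |\omega| \le \pi$  (64)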


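The constants $M$ and $\rho$ are unknown, so in order to evaluate the above robustness condition, either we require a priori knowledge or infer the values from the first $n$ impulse response coefficients $\hat{\alpha}' = [\hat{g}_1 \cdots \hat{g}_n]$. That is, define the estimates $\hat{M}, \hat{\rho}$ via

$|\hat{g}_k| \le \hat{M}\hat{\rho}^{k-1}, \quad \forall k \in [1, n]$  (65)

and replace $M, \rho$ with $\hat{M}, \hat{\rho}$. This leads to the robustness test:

$|Q(e^{j\omega})|^2 \left[\frac{3n\hat{\lambda}_e}{N\lambda_u} + \frac{\hat{M}^2\hat{\rho}^{2n}}{(1 - \hat{\rho})^2}\right] < 1, \quad \forall |\omega| \le \pi$  (66)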


Now, suppose that the closed-loop system is stable and the above inequality holds. Then the spectrum of $y$ is bounded, with high probability, by:

$\Phi_y(\omega) \le \frac{|T(e^{j\omega})|^2 \hat{\lambda}_e}{\left(1 - |Q(e^{j\omega})| \left[\frac{3n\hat{\lambda}_e}{N\lambda_u} + \frac{\hat{M}^2\hat{\rho}^{2n}}{(1 - \hat{\rho})^2}\right]^{1/2}\right)^2}$  (67)

The above bound gives an indication of the trade between bias and variance as the model order varies, all results being valid for data length $N > 500$ with probability in excess of 99.95%.


Concluding  Remarks 


Using an output error linear plant, we have shown that with Gaussian noise and affine models, there is a very rich structure in the analysis of standard least-squares estimation of the first $n$ impulse response coefficients. The remaining coefficients bias the estimate in a precisely defined way involving non-central chi-squared statistics. These appear to be extremely useful in predicting model error for robust control design from finite data records. Much still remains to be done even for this restricted and analytically tractable case, particularly in finding a means to bound the effect of the bias (the tail of the impulse response) without having to perform additional identification with ever larger parameter orders. This ultimately may involve additional a priori quantitative knowledge. We feel that this paper indicates a first step towards the more difficult problem of model structures which account for non-white noise, e.g., ARX or ARMAX models.

Worst-Case Control Design from Batch-Least-Squares Identification


Robert L. Kosut    Henk Aling *†


Abstract  A  case  study  is  presented  to  support  the  thesis 
that  high  order  models  obtained  from  batch-least-squares  pro¬ 
vide  all  the  necessary  significant  information  for  robust  control 
design. 

1  Introduction 

Suppose that the measured sampled-data set

$\{y_t, u_t \mid t = 1:N\}$  (1)

has been obtained from an unknown system where $u$ is the scalar control input and $y$ is the scalar sensed output. Suppose further that it is known a priori that the system which generated the data is stable, linear-time-invariant (lti), and operating in open-loop. Hence,

$y = Gu + v$  (2)

where $G$ is an lti operator with unknown transfer function $G(z)$. The output disturbance $v$ is known to be a zero-mean sequence with the unknown spectrum $\Phi_v(\omega)$. The problem is to use the measured data set (1) together with the a priori information to obtain estimates $\hat{G}(z)$ and $\hat{\Phi}_v(\omega)$ which can be used for control design.

To see the control problem more clearly, suppose somehow we have obtained estimates $\hat{G}(z)$ and $\hat{\Phi}_v(\omega)$. The next step is to design a feedback controller. Let the control be

$u = -Ky$  (3)

where $K$ is lti with transfer function $K(z)$. Typically the controller $K$ is designed for the estimated system

$y = \hat{G}u + v, \quad \mathrm{spectrum}\{v\} = \hat{\Phi}_v(\omega)$  (4)

The problem is that the estimated system differs from the true system (2) and hence, predicted performance, based on the estimate, may not at all be like the performance actualized when the controller is applied. If a bound on the model error between the true and estimated systems were known, then a robust controller could be designed. Various approaches have been put forth to resolve this problem, e.g., [3], but these involve forms of prior information and/or approximations which are either very difficult to obtain or are too coarse. For control design, the error needs to be well known near the desired bandwidth of the closed-loop system, which may not be known beforehand. Hence, prior information on the impulse response, such as magnitude and rate of decay, is unlikely to contribute significantly to a usable estimate of the error near the desired bandwidth because the impulse response bound provides only very low frequency and very high frequency information. Ironically, any precise information about the system dynamics near the desired bandwidth is likely to preclude the need for identification.

In this paper we propose the thesis that high order models obtained from batch-least-squares provide all the necessary significant information for robust control design without invoking additional prior information. A case study is presented which (of course) supports the thesis.

* The Authors are with Integrated Systems, Inc., 3260 Jay Street, Santa Clara, CA 95054.

† Research supported by AFOSR/Directorate of Mathematical Sciences contract I' I0(>20-K0 C-0119, and NSF Grant ISI-916M08.


2 Batch-Least-Squares


Perhaps the most widely used procedure for obtaining the estimates is via batch-least-squares where:

$\hat{G}(z) = \frac{\hat{B}(z)}{\hat{A}(z)} = \frac{\hat{b}_1 z^{-1} + \cdots + \hat{b}_n z^{-n}}{1 + \hat{a}_1 z^{-1} + \cdots + \hat{a}_n z^{-n}}$  (5)

$\hat{\Phi}_v(\omega) = \frac{\hat{\lambda}}{|\hat{A}(e^{j\omega})|^2}$  (6)

$\hat{\lambda} = \frac{1}{N} \sum_{t=1}^{N} \left[(\hat{A}y - \hat{B}u)_t\right]^2$  (7)

where

$\hat{A}(z), \hat{B}(z) = \arg\min \frac{1}{N} \sum_{t=1}^{N} \left[(Ay - Bu)_t\right]^2$  (8)

The number $n$ will be referred to as the model order. (Actually the numerator and denominator orders need not be the same as shown here.) In every practical situation there is no finite value of $n$ for which the right hand side in (8) is zero. That is, the true system is not in the model set - an axiom.

The great appeal of “least-squares,” and the principal reason for its ubiquity, is that, provided the input is sufficiently rich in spectral content, a unique minimum of (8) is always obtained. Furthermore, there are very efficient and reliable methods for computing the solution, typically involving square-root calculations such as the QR transformation as well as lattice forms for very high model orders. It is imperative that the calculations are done in this manner, for otherwise significant numerical errors will accrue, even for a small number of parameters. There are other reasons as well for using a QR method, e.g., (1) high model orders and large amounts of data are easily handled, (2) data from different experiments are readily combined without re-doing the entire estimation, and (3) prediction errors can be computed for varying model orders directly from the QR transformation. These factors make it possible to easily and rapidly generate extremely high order models from large amounts of data. This facility in turn provides a great deal of information about the true system.
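
A batch-least-squares fit of the model (5)-(8) through a QR factorization can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: it uses NumPy's thin QR on the stacked regressor matrix, the helper names `arx_regressors` and `fit_arx_qr` are hypothetical, and the first-order test system is an arbitrary noise-free example.

```python
import numpy as np

def arx_regressors(y, u, n):
    """Stack rows [-y(t-1..t-n), u(t-1..t-n)] for t = n .. N-1, with targets y(t)."""
    N = len(y)
    cols = [-y[n - k : N - k] for k in range(1, n + 1)]
    cols += [u[n - k : N - k] for k in range(1, n + 1)]
    return np.column_stack(cols), y[n:]

def fit_arx_qr(y, u, n):
    """Least-squares ARX fit via thin QR; returns (a_hat, b_hat, residual variance)."""
    Phi, target = arx_regressors(y, u, n)
    Q, R = np.linalg.qr(Phi)                    # Phi = Q R with R upper triangular
    theta = np.linalg.solve(R, Q.T @ target)    # solve R theta = Q' target
    resid = target - Phi @ theta
    return theta[:n], theta[n:], float(np.mean(resid ** 2))

# Noise-free first-order example: y(t) = 0.5 y(t-1) + u(t-1), i.e. a_1 = -0.5, b_1 = 1.
rng = np.random.default_rng(0)
u = rng.standard_normal(400)
y = np.zeros(400)
for t in range(1, 400):
    y[t] = 0.5 * y[t - 1] + u[t - 1]

a_hat, b_hat, lam_hat = fit_arx_qr(y, u, 1)
print(a_hat, b_hat, lam_hat)
```

Working with the triangular factor avoids forming the normal equations, whose condition number is the square of that of the regressor matrix; this is the numerical concern the paragraph above raises.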


3  Robust  Control  Design 

Using the control (3) on the plant (2) yields the closed-loop system

$y = T_{yv}\, v$
$u = -T_{uv}\, v$  (9)

where

$T_{yv} = \frac{1}{1 + GK}, \quad T_{uv} = \frac{K}{1 + GK}$  (10)

To arrive at an expression involving the estimates, let

$\varepsilon = \hat{A}y - \hat{B}u$  (11)

denote the prediction error after identification. Using the plant description (2) gives the equivalent expression for $\varepsilon$:

$\varepsilon = w + \Delta u$
$w = \hat{A}v$  (12)
$\Delta = \hat{A}G - \hat{B}$

As a result the closed-loop system is equivalently:

$y = T_{yw}\, w$
$u = -T_{uw}\, w$  (13)

where

$T_{yw} = \frac{T_{y\varepsilon}}{1 + \Delta T_{u\varepsilon}}, \quad T_{uw} = \frac{T_{u\varepsilon}}{1 + \Delta T_{u\varepsilon}}$  (14)

and

$T_{y\varepsilon} = \frac{\hat{H}}{1 + \hat{G}K}, \quad T_{u\varepsilon} = \frac{\hat{H}K}{1 + \hat{G}K}$  (15)

Since $K$ is designed for the estimated system, it follows that $T_{y\varepsilon}$ and $T_{u\varepsilon}$ are stable. Hence, $K$ stabilizes the true system if and only if $(1 + \Delta T_{u\varepsilon})^{-1}$ is stable. Because both $T_{u\varepsilon}$ and $\Delta$ are stable, $K$ stabilizes the true system if and only if

$|1 + \Delta(e^{j\omega})T_{u\varepsilon}(e^{j\omega})| \ne 0, \quad \forall |\omega| \le \pi$  (16)

The well known condition for robust stability [1], and sufficient for (16), is that the loop-gain be less than one,

$|T_{u\varepsilon}(e^{j\omega})\Delta(e^{j\omega})| < 1, \quad \forall |\omega| \le \pi$  (17)

To verify either (16) or (17) requires some means of estimating $\Delta(e^{j\omega})$ or a bound on $|\Delta(e^{j\omega})|$. In addition, to predict closed-loop performance requires producing an estimate of $\Phi_w(\omega)$, the spectrum of $w$ as defined in (12). Estimates of both can be obtained using standard spectral methods (Ch.6,[2]) as follows:

$\hat{\Delta}(e^{j\omega}) = \hat{\Phi}_{\varepsilon u}(\omega) / \hat{\Phi}_u(\omega)$
$\hat{\Phi}_w(\omega) = \hat{\Phi}_\varepsilon(\omega) - |\hat{\Phi}_{\varepsilon u}(\omega)|^2 / \hat{\Phi}_u(\omega)$  (18)

All the $\hat{\Phi}$ variables are generated from the post-identification data set:

$\{\varepsilon_t, u_t \mid t = 1:N\}$  (19)

It is important to mention that spectral estimation techniques also introduce errors. How the spectral estimate varies from the true is not known precisely although asymptotic results are available [2]. These are similar to asymptotic results for estimating model error from batch-least-squares. Unfortunately, precise conditions for which the asymptotic results are good approximations are not known without invoking additional prior information, which we argue may not be obtainable in practice. For this reason we proceed heuristically, and simply utilize the spectral estimates.

Based on these estimates, we obtain the following approximations to the closed-loop rms values

$\mathrm{rms}(y) \approx \left[\frac{1}{2\pi}\int_{-\pi}^{\pi} |\hat{T}_{yw}(e^{j\omega})|^2\, \hat{\Phi}_w(\omega)\, d\omega\right]^{1/2}$
$\mathrm{rms}(u) \approx \left[\frac{1}{2\pi}\int_{-\pi}^{\pi} |\hat{T}_{uw}(e^{j\omega})|^2\, \hat{\Phi}_w(\omega)\, d\omega\right]^{1/2}$  (20)

where

$\hat{T}_{yw} = \frac{T_{y\varepsilon}}{1 + \hat{\Delta}T_{u\varepsilon}}, \quad \hat{T}_{uw} = \frac{T_{u\varepsilon}}{1 + \hat{\Delta}T_{u\varepsilon}}$  (21)

are estimates of the actual closed-loop transfer functions in (14). Following (17), $\hat{T}_{yw}$ and $\hat{T}_{uw}$ are stable if the estimated loop-gain satisfies,

$|T_{u\varepsilon}(e^{j\omega})\hat{\Delta}(e^{j\omega})| < 1, \quad \forall |\omega| \le \pi$  (22)

Since $\hat{\Delta}(e^{j\omega})$ is an approximation to $\Delta(e^{j\omega})$, satisfaction of (22) does not imply that (17) holds. Moreover, since (17) is sufficient to insure (16), failure of (22) to hold does not imply instability, but certainly merits some caution at those frequencies where the test fails.
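
The spectral estimates of the model error and of the spectrum of $w$ can be sketched with a segment-averaged periodogram. This is one simple choice of "standard spectral method", swapped in here for illustration; the text only cites [2] and does not specify the estimator. The function names, the segment count, and the test signal (a pure gain error of 0.5 plus small white noise) are assumptions.

```python
import numpy as np

def averaged_cross_spectrum(a, b, nseg=16):
    """Segment-averaged cross-periodogram estimate of the cross spectrum of a and b."""
    seg_len = len(a) // nseg
    A = np.fft.rfft(np.reshape(a[: nseg * seg_len], (nseg, seg_len)), axis=1)
    B = np.fft.rfft(np.reshape(b[: nseg * seg_len], (nseg, seg_len)), axis=1)
    return (A * np.conj(B)).mean(axis=0) / seg_len

def spectral_error_estimates(eps, u, nseg=16):
    """Return (delta_hat, phi_w_hat) from the prediction error eps and the input u."""
    phi_u = averaged_cross_spectrum(u, u, nseg).real
    phi_eps = averaged_cross_spectrum(eps, eps, nseg).real
    phi_eps_u = averaged_cross_spectrum(eps, u, nseg)
    delta_hat = phi_eps_u / phi_u
    phi_w_hat = phi_eps - np.abs(phi_eps_u) ** 2 / phi_u
    return delta_hat, phi_w_hat

# Synthetic post-identification data: eps = w + Delta*u with Delta = 0.5 (a pure gain).
rng = np.random.default_rng(0)
u = rng.standard_normal(4096)
w = 0.01 * rng.standard_normal(4096)
eps = w + 0.5 * u

delta_hat, phi_w_hat = spectral_error_estimates(eps, u)
print(np.max(np.abs(delta_hat - 0.5)), phi_w_hat.mean())
```

Given frequency samples of $T_{u\varepsilon}$ on the same grid, the estimated loop-gain test amounts to checking that `np.abs(T_ue * delta_hat)` stays below one at every grid point.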

4  Case  Study 

Simulated System The computer simulated (true) system is the mass-spring-damper mechanism shown in figure 1, where $u$ is the control force and $d$ is an exogenous disturbance force. The unknown disturbance $d$ is a zero-mean random sequence with variance $(.001)^2$. The user applies a zero-mean sequence with unit variance as the control input for identification. The data set is stored for 1024 samples. The sampling frequency for both control and sensing is 10 Hz.

Model Order Selection Batch-least-squares estimates are computed using MATRIXx from the data set $\{y_t, u_t \mid t = 1:512\}$, thus, $N = 512$ in (1). The remaining data set, $\{y_t, u_t \mid t = 513:1024\}$, is used for validation. Figure 2 shows the normalized rms values of the prediction error for model orders from 1 to 60 on both the identification and validation data sets. Using the identification data, the rms values continually decrease as the order increases, which is to be expected because after a certain point the model is fitting noise. This is verified using the validation data (lower bar plot of figure 2) where the rms values actually begin to slightly increase with increasing model order. Thus, beyond the range from $n = 10$ to $n = 20$, no new information is really obtained in the identification. Hence, the “optimal” model order from this data set is in this range. This phenomenon can also be seen by examining the prediction error time series, shown in figure 3 for model orders $n = 4, 16, 60$.

In figure 3, as the model order increases from 4 to 16, the variation (rms) of the error decreases. However, increasing the order from 16 to 60 decreases the error over the identification samples ($t = 1:512$), whereas the error slightly increases over the validation samples ($t = 513:1024$). To emphasize the effect of noise fitting, we repeated the experiment with the shorter identification set $\{y_t, u_t \mid t = 1:256\}$. The results are shown in figures 4-5. Now the rms of the prediction error using the validation data increases more sharply for $n > 12$, and the graph of $\varepsilon(t)$ for $n = 60$ shows a significantly smaller variation over the identification samples $t = 1:256$.
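
The identification-versus-validation comparison described above can be sketched in a few lines. This is a hedged illustration, not a reproduction of the case study: the second-order plant, the noise level, and the helper names `arx_rows` and `order_sweep` are assumptions, the rms values are not normalized, and the orders swept are 1 to 20 rather than 1 to 60.

```python
import numpy as np

def arx_rows(y, u, n, start, stop):
    """Rows [-y(t-1..t-n), u(t-1..t-n)] and targets y(t) for t = start .. stop-1."""
    t = np.arange(start, stop)
    cols = [-y[t - k] for k in range(1, n + 1)] + [u[t - k] for k in range(1, n + 1)]
    return np.column_stack(cols), y[t]

def order_sweep(y, u, orders, n_id):
    """Fit each order on the first n_id samples; return rms error on both data halves."""
    n_max = max(orders)
    id_rms, val_rms = [], []
    for n in orders:
        Phi, target = arx_rows(y, u, n, n_max, n_id)      # same rows for every order
        theta, *_ = np.linalg.lstsq(Phi, target, rcond=None)
        Phi_v, target_v = arx_rows(y, u, n, n_id, len(y))
        id_rms.append(np.sqrt(np.mean((target - Phi @ theta) ** 2)))
        val_rms.append(np.sqrt(np.mean((target_v - Phi_v @ theta) ** 2)))
    return np.array(id_rms), np.array(val_rms)

# Hypothetical lightly damped second-order plant with additive output noise.
rng = np.random.default_rng(0)
N = 1024
u = rng.standard_normal(N)
v = 0.1 * rng.standard_normal(N)
x = np.zeros(N)
for t in range(2, N):
    x[t] = 1.5 * x[t - 1] - 0.7 * x[t - 2] + u[t - 1] + 0.5 * u[t - 2]
y = x + v

orders = list(range(1, 21))
id_rms, val_rms = order_sweep(y, u, orders, n_id=512)
best_order = orders[int(np.argmin(val_rms))]
print("identification rms:", np.round(id_rms, 4))
print("validation rms:    ", np.round(val_rms, 4))
print("order with smallest validation rms:", best_order)
```

Because every order is fitted on the same rows, the regressor sets are nested and the identification rms cannot increase with the order, whereas the validation rms is free to rise once the extra parameters start fitting noise.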

Control Design Based on the above results it seems reasonable to select a design model with an order in the range $10 < n < 20$. For illustrative purposes here we select three values $n = 4, 16, 40$. Figure 6 shows the magnitude and phase of the frequency responses of the true system $G(e^{j\omega})$ and the three estimates $\hat{G}_n(e^{j\omega})$ corresponding to $n = 4, 16, 40$. As suggested by the rms plots in figure 2, the largest errors occur for $n = 4$, and for $n = 40$, the estimates are “noisy.” Similarly, figure 7 shows the true spectrum $\Phi_v(\omega)$ and the three spectral estimates $\hat{\Phi}_{v,n}(\omega)$ corresponding to $n = 4, 16, 40$.

To evaluate the efficacy of the closed-loop rms approximations in (20), a set of LQG controllers were designed for each of the plant models as follows. For each $n = 4, 16, 40$, the observer was based on the model,

$\hat{A}_n y = \hat{B}_n u + e,$

where $e$ is taken as a white noise with unit variance. The regulator is then designed to minimize the expected value of $\sum_t [y_t^2 + (\rho u_t)^2]$ for control weights $\rho = 10, 1, .1, .01$. Thus, we obtain the family of 12 controllers,

$u = -K_{n,\rho}\, y$

Figure 8 shows the predicted and actual performance tradeoff between rms(y) and rms(u). Observe that for $n = 16, 40$, the predicted performance is very similar to the actual performance, whereas for $n = 4$, the actual performance is significantly better than the predicted. Recall that $n = 16$ is considered to be an optimal choice based on the cross validation plots in figure 2. Beyond $n = 16$, no significant performance increases were observed.

The performance tradeoff of the different controllers is not at all complete by just examining figure 8. This does not show the robustness properties of the different controllers. For $n = 16$, figure 9 shows the estimated loop-gain $|T_{u\varepsilon}(e^{j\omega})\hat{\Delta}(e^{j\omega})|$ and the actual loop-gain $|T_{u\varepsilon}(e^{j\omega})\Delta(e^{j\omega})|$ for two of the 12 LQG designs, namely, for model order $n = 16$ with control weights $\rho = 1, .01$. For the smaller weight, $\rho = .01$, there is no robustness guarantee because the estimated loop gain is greater than 1 at some high frequencies. However, the actual loop gain remains less than 1 for all frequencies. In addition, in every other case (not shown here), the estimated loop-gain was always larger (more conservative) than the actual gain.

Figure 1: Mass-spring system with $M_1 = 1$, $M_2 = .25$, $K_1 = 6$, $K_2 = .125$, $D_1 = .3$, $D_2 = .05$. The sensor reads $y = 100x_2$.

Figure 2: Normalized RMS prediction error vs. model order. Identification for $t = 1:512$, validation for $t = 513:1024$.

Figure 3: Prediction error $\varepsilon(t)$ for $n = 4, 16, 60$; identification from samples $t = 1:512$, validation from samples $t = 513:1024$.

Figure 4: Normalized RMS prediction error vs. model order. Identification for $t = 1:256$, validation for $t = 513:1024$.

System Identification for Robust Control Design

Robert L. Kosut


Abstract  Some  recent  results  are  summarized  in  parame¬ 
ter  set  estimation  for  linear-time-invariant  systems.  The  ex¬ 
tension  to  nonlinear  uncertain  systems  is  explored  and  some 
preliminary  results  are  presented.  Robust  control  design  re¬ 
quirements  are  also  discussed. 


1  Introduction 


There  are  many  ways  to  design  or  configure  an  adaptive  con¬ 
trol  system.  Figure  1  depicts  the  self-tuning-regulator  (STR) 
configuration  [2].  Two  feedback  processes  make  it  adaptive, 
namely:  (i)  a  model  parameter  estimator,  and  (ii)  a  control 
design  rule. 


Figure 1: Self Tuning Regulator (STR).

The parameter estimator operates on the input-output data obtained from measurements $(y, u)$ of the plant system and produces a model parameter estimate $\hat{\theta} \in \mathbb{R}^p$. The parameter estimate is transformed by the control design rule into a controller parameter $\rho \in \mathbb{R}^q$, which is then used in a predetermined parametric controller structure in feedback with the actual system.

It  is  obviously  very  easy  to  construct  an  adaptive  system: 
just  connect  a  control  design  rule  and  an  estimator  together. 
However,  it  is  very  difficult  to  insure  that  the  resulting  adap¬ 
tive  system  will  provide  acceptable  performance.  This  has 
been  the  goal  of  research  in  this  area  for  over  30  years. 

Roughly, if the true system is in the model set which underlies the parameter estimator, then the adaptive system will asymptotically reduce the error signal for arbitrary bounded exogenous inputs $(r, d)$. Technically it is necessary that a certain (closed-loop) transfer function is strictly-positive-real (SPR) [18,10,1], e.g., $H(s)$ is SPR if it is stable and satisfies,

$\mathrm{Re}[H(j\omega)] > 0, \quad \forall \omega$  (1)

The main difficulty, to put it simply, is that the true system is never in the model set (there are always dynamical phenomena which remain unaccounted for) and unfortunately, the SPR condition fails to hold. Moreover, the theory based on this property provides only a sufficient condition and hence does not predict what will happen if the SPR condition is violated.
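
The frequency-domain part of the SPR condition is easy to probe numerically. The sketch below is illustrative only: it samples $\mathrm{Re}[H(j\omega)]$ on a finite logarithmic grid (so it cannot prove positivity for all $\omega$, and it does not check stability), and the two transfer functions are arbitrary textbook examples, not systems from the paper.

```python
import numpy as np

def real_part_on_grid(num, den, omega):
    """Re[H(j*omega)] for H(s) = num(s)/den(s), coefficients in descending powers of s."""
    s = 1j * omega
    return (np.polyval(num, s) / np.polyval(den, s)).real

omega = np.logspace(-3, 3, 2000)

# H(s) = 1/(s + 1): Re[H(jw)] = 1/(1 + w^2) > 0 at every frequency.
re_first_order = real_part_on_grid([1.0], [1.0, 1.0], omega)

# H(s) = 1/(s + 1)^3: the phase passes -90 degrees, so the real part turns negative.
re_third_order = real_part_on_grid([1.0], [1.0, 3.0, 3.0, 1.0], omega)

passes_first_order = bool(np.all(re_first_order > 0))
passes_third_order = bool(np.all(re_third_order > 0))
first_violation = omega[np.argmax(re_third_order <= 0)]
print(passes_first_order, passes_third_order, first_violation)
```

For the third-order example the real part changes sign near $\omega = \tan(\pi/6) \approx 0.577$, which shows how quickly unmodelled lag can destroy the positivity property.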

Under sufficiently slow adaptation the method of averaging can be applied to expose a mechanism for stability and instability [1],[21],[2]. This theory replaces the above SPR condition with a “signal dependent positivity condition” of the form,

$R = \int \mathrm{Re}[H(j\omega)]\, S(\omega)\, d\omega > 0$  (2)

where $S(\omega) > 0$ is a spectral density matrix associated with the exogenous inputs. This condition is much less restrictive because even if $H(j\omega)$ fails to satisfy the SPR condition (1) at some frequencies, (2) can still hold provided the excitation is concentrated at those frequencies where $\mathrm{Re}[H(j\omega)] > 0$. Moreover, if any eigenvalue of $R$ is negative then the system is unstable. In using the theory for design, the user must select an appropriate combination of data filtering and excitation spectrum. This task is similar to problems encountered in system identification [17] except that here the system being identified is in closed-loop, which vastly complicates the selection process.

To see this more clearly, consider the function $F(\theta)$ defined via Figure 2, i.e., for every parameter choice $\theta$ there is a resulting parameter estimate denoted by the function $F(\theta)$.


Figure  2:  Illustration  of  the  parameter  map  F. 

It is shown in [19] that under slow adaptation, convergence points of the STR system in Figure 1 are precisely the fixed-points of $F$. Moreover, the fixed-point is stable if (2) holds and is unstable otherwise.

In summary, the averaging result shows that stability of the (nonlinear) adaptive system can be deduced from a frequency domain condition (2) which mixes signals and systems. However, there are some difficulties in utilizing the theory. In the first place, it is no trivial task to determine the fixed point(s) of the map $F$, i.e., those $\theta \in \mathbb{R}^p$ satisfying $\theta = F(\theta)$. Secondly, both the transfer function $H(s)$ and the spectrum $S(\omega)$ depend in a complicated manner on the fixed-point and it is unclear how to precisely manipulate data filters and input spectrum to achieve either a satisfactory fixed-point and/or a satisfactory transient response in the adaptive parameter trajectory. To put it bluntly, the theory fails to produce a “user friendly” design method.

If we agree that the fundamental difficulty in analyzing the adaptive system is the ubiquitous model uncertainty, then one alternate approach is to configure an adaptive control system which specifically accounts for the uncertainty. One such scheme, depicted in Figure 3, replaces the parameter estimator in Figure 1 with an estimator that produces a model set or set of uncertainty. This would avoid the major obstacle, namely, that the true system is not in the model set used for identification. This type of estimator is referred to as an uncertainty estimator or a set estimator. This differs from the estimator in the usual adaptive schemes (cf. Figure 1), where a single estimated model is produced, with no information regarding its accuracy.


forts,  e.g.  ,  see  [3,4]  and  the  references  therein. 

In the remainder of the paper we principally address set estimation for linear and nonlinear systems. Section 2 provides a review of some recent results in linear set estimation and some new results in nonlinear set estimation. Section 3 provides a brief section on linear robust control of plants with both parametric and nonlinear uncertainty set descriptions.


2  Set  Estimation 

Set estimators should at least have the following features:

•  Uncertain Parameters. A capability to account for that part of the system which is known to be governed by physical laws or able to be described by known functions dependent on certain constant parameters. The parameters may only be known to lie within some range of variation.

•  Uncertain Dynamics. Able to account for uncertain dynamics for which a parametric structure is not available or assumed, e.g., neglected high frequency flexible modes, uncertain memoryless nonlinearities, etc.




Figure  3:  Adaptive  control  with  uncertainty  estimation. 

The second change is to use a robust control design rule, i.e., one that accepts a model set in the form produced by the set estimator. Under these conditions, if the true system which generated the measured data is contained in the estimated set, then the adaptive system is not only stable, but achieves the maximum performance possible given the estimated set of uncertainty.

Proceeding in this way we have transformed the problem of adaptive control design from analysis with trial-and-error into separate synthesis problems in set estimation and robust control design. In effect this is a “separation principle” analogous to that in the LQG design.


2.1  Linear  Set  Estimation 

Consider the linear-time-invariant model set:

$$S(\Theta, W) = \{ G_\theta (1 + \Delta W) : \theta \in \Theta,\ \|\Delta\|_\infty \le 1 \} \tag{3}$$

The set $S(\Theta, W)$ describes both parametric and nonparametric uncertainty. The parametric uncertainty is reflected in the set $\{G_\theta : \theta \in \Theta\}$ where $G_\theta$ is a parametric transfer function with uncertain parameters $\theta \in \Theta \subset \mathbb{R}^p$. The mapping $\theta \to G_\theta$ is known but the exact parameter values are known only to be in some set $\Theta$. The nonparametric uncertainty is reflected in the set $\{\Delta : \|\Delta\|_\infty \le 1\}$. Thus $\Delta$ is an uncertain linear-time-invariant system only known to be stable and unity bounded in the $\mathcal{H}_\infty$-norm, which for continuous time systems is defined as $\|\Delta\|_\infty = \sup_{\omega \in \mathbb{R}} |\Delta(j\omega)|$ and for discrete time systems as $\|\Delta\|_\infty = \sup_{|\omega| \le \pi} |\Delta(e^{j\omega})|$. $W$ is a stable transfer function which reflects the size of the relative (or multiplicative) uncertainty, i.e.,

$$\|\Delta\|_\infty = \left\| \frac{G - G_\theta}{G_\theta W} \right\|_\infty \le 1$$

The above expression suggests interpreting the set $S(\Theta, W)$ as a set of transfer functions “centered” at the parametric transfer function $G_\theta$ with a “radius of uncertainty” of $G_\theta W$.


At  present,  methodologies  for  the  design  of  set  estimators 
are  under  development,  e.g.  ,  [23],  [12],  [16],  [13], [9], [14],  [24]. 
On  the  other  hand,  there  is  a  reasonable  maturity  of  method¬ 
ologies  for  robust  control  design,  particularly  for  plants  with 
uncertain  nonparametric  linear  dynamics,  e.g.  ,  [20],  [25],  [5,6], 
[8],  (22).  Robust  control  design  of  plants  with  parametric  un¬ 
certainty  seems  still  underdeveloped  despite  some  heroic  ef- 


It is usually possible in a modeling process to arrive at an initial parameter set $\Theta_0$ and a weighting transfer function $W_0$. When the prior set $S(\Theta_0, W_0)$ is too coarse to lead to tolerable closed-loop performance levels, a model set estimator is required to refine the prior information by making use of measured data. Specifically, we extract the following result from [14].

Theorem

Suppose that the measured data set

$$\{y(t), u(t) : t = 1, \ldots, N\}$$

is obtained from the sampled-data system

$$y = Gu$$

where $G$ has the discrete-time transfer function $G(z)$. Furthermore, suppose that from prior information

$$G \in S(\Theta_0, W_0)$$

and the parametric transfer function in (3) has the structure:

$$G_\theta(z) = \frac{B_\theta(z)}{A_\theta(z)} = \frac{b_1 z^{-1} + \cdots + b_m z^{-m}}{1 + a_1 z^{-1} + \cdots + a_n z^{-n}}$$

$$\theta^T = [a_1 \ \cdots \ a_n \ \ b_1 \ \cdots \ b_m]$$

Under these conditions, if $G$ is initially at rest, and is either stable or in a stabilizing feedback, then:

$$G \in S(\Theta_0, W_0) \cap S(\Theta_N, W_0) \tag{4}$$

where $\Theta_N$ is the parameter set estimate,

$$\Theta_N = \{\theta : \|A_\theta y - B_\theta u\|_N \le \|B_\theta W_0 u\|_N\} \tag{5}$$

with the $N$-point signal norm

$$\|x\|_N^2 = \sum_{t=1}^{N} x(t)^2.$$
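As a rough illustration of how the test in (5) might be carried out on a finite data record, the sketch below checks whether one candidate $\theta$ belongs to $\Theta_N$. It assumes, purely for illustration, that $W_0$ is available as a finite impulse response `w0` and that all signals start from rest; the helper names `fir` and `in_theta_N` are hypothetical and are not taken from [14].

```python
def fir(h, x):
    """Causal FIR filter started from rest: out[t] = sum_i h[i] * x[t - i]."""
    return [sum(h[i] * x[t - i] for i in range(min(len(h), t + 1)))
            for t in range(len(x))]


def in_theta_N(theta, n, y, u, w0):
    """Check inequality (5) for theta = [a_1..a_n, b_1..b_m].

    w0 is an assumed FIR description of the weighting W_0.
    """
    a = [1.0] + list(theta[:n])   # A_theta(z) = 1 + a_1 z^-1 + ... + a_n z^-n
    b = [0.0] + list(theta[n:])   # B_theta(z) = b_1 z^-1 + ... + b_m z^-m
    lhs = [p - q for p, q in zip(fir(a, y), fir(b, u))]   # A_theta y - B_theta u
    rhs = fir(b, fir(w0, u))                               # B_theta W_0 u
    # Comparing squared N-point norms is equivalent to comparing the norms.
    return sum(v * v for v in lhs) <= sum(v * v for v in rhs)
```

Sweeping such a test over a grid of candidate parameters gives a crude picture of $\Theta_N$; the ellipsoidal description discussed in the text would avoid the grid.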

The above result implies that if the true parametric transfer function is $G_{\theta_{true}}$, where $\theta_{true} \in \mathbb{R}^p$ is the true parameter value, then

$$\theta_{true} \in \Theta_0 \cap \Theta_N$$

A good data set would ensure that the new set estimate is strictly inside the prior estimate, that is

$$\Theta_0 \cap \Theta_N \subset \Theta_0$$

Since both $A_\theta$ and $B_\theta$ are affine functions of $\theta$, it can be shown [14] that $\Theta_N$ describes either an ellipsoid or a hyperboloid in $\mathbb{R}^p$, depending on the data. Moreover, although the set $\Theta_0 \cap \Theta_N$ is not an ellipsoid, a bounding ellipsoid can nonetheless be obtained.

A  similar  result  is  obtained  in  [24]  for  a  co-prime  factor  non- 
parametric  uncertainty  structure  rather  than  the  multiplica¬ 
tive  one  used  here.  More  on  bounding  ellipsoids  can  be  found 
in  [7]  who  considered  the  problem  of  parameter  set  estimation 
with  bounded  noise  and  no  unmodeled  dynamics. 


2.2  Nonlinear  Set  Estimation 

The preceding principles of set estimation for linear-time-invariant systems can be applied to the set estimation of nonlinear systems. We will illustrate the problems using the following three example systems: (i) an input nonlinearity, (ii) an output nonlinearity, and (iii) a mechanical system with backlash.


Example  1:  Input  Nonlinearity 

Consider the system shown in Figure 4 and described by:

$$y = G_\theta \bar{u}, \qquad \bar{u} = f(u) \tag{6}$$


Figure  4:  Input  nonlinearity. 


Make  the  following  assumptions: 


1. The function $f(\cdot)$ is a memoryless time-invariant nonlinearity known to lie in the sector

$$|f(u) - ku| \le \delta |u|, \quad \forall |u| \le \rho \tag{7}$$

where $\delta < k$ and $\rho > 0$ are known constants.

2. $G_\theta$ is a continuous-time linear-time-invariant system with stable transfer function $G_\theta(s)$ and where $\theta \in \mathbb{R}^p$ are uncertain parameters.

3. The measured data set is

$$\{y(t), u(t) : t = 1, \ldots, N\}$$

where the time $t$ is normalized to the sampling interval.


The constants $(k, \delta, \rho)$ quantify the uncertainty in the nonlinear function $f(\cdot)$ in much the same way that $W$ bounds the uncertain linear-time-invariant nonparametric dynamics in the previous section. A problem here, though, is that $\bar{u}$, the input to the linear part of the system, is not a measured variable. Moreover, the nonlinear function precludes describing any discrete-time transfer function from $u$ into $y$. However, provided $f(\cdot)$ is sufficiently smooth, for fast sampling we have the following sampled-data approximation

$$y \approx G_\theta \bar{u}, \qquad \bar{u} = f(u) \tag{8}$$

where now $G_\theta(s)$ is approximated by the zero-order-hold z-transform

$$G_\theta(z) = (1 - z^{-1})\, \mathcal{Z}\left\{ \tfrac{1}{s} G_\theta(s) \right\}$$

This approximation is only valid at the sample times $t \in \{1, \ldots, N\}$. For example, if $f(\cdot)$ is a polynomial or rational function, then there certainly exists a (not necessarily small) region $|u| \le \rho$ such that (7) holds.

To illustrate the problems in obtaining a set estimator even for the approximate system (8), suppose that $(k, \delta, \rho)$ are known, and we wish to estimate a parametric model for $G_\theta(z)$. For illustrative purposes, suppose that $G_\theta(z)$ is in the two-parameter set:

$$G_\theta(z) = \frac{B_\theta(z)}{A_\theta(z)} = \frac{b z^{-1}}{1 + a z^{-1}}, \qquad \theta = \begin{bmatrix} a \\ b \end{bmatrix} \tag{9}$$


After some algebra (write $\bar{u} = f(u) = ku + e$ and multiply through by $A_\theta$) one obtains the following equivalent input/output description of (8):

$$A_\theta y - k B_\theta u = B_\theta e \tag{10}$$

where $e(t)$ is an uncertain sequence satisfying

$$|e(t)| \le \delta |u(t)|, \quad \forall t = 1, \ldots, N \tag{11}$$

Since $(k, \delta, \rho)$ are known and $u(t)$ is measured, the upper bound on the error sequence is known at each time instant. Combining the above expressions with the prior information $\theta \in \Theta_0$, we obtain the parameter set estimate

$$\Theta_0 \cap \Theta_N$$

where $\Theta_N$ consists of those $\theta$ satisfying

$$|y(t) + a y(t-1) - k b u(t-1)| \le \delta |b|\, |u(t-1)| \tag{12}$$

for all $t = 1, \ldots, N$.
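The pointwise test that defines the set estimate in this example can be sketched in a few lines. The sketch assumes the sector condition (7) is used in the form $f(u) = ku + e$ with $|e| \le \delta|u|$, so that a candidate $(a, b)$ is accepted when $|y(t) + a y(t-1) - k b u(t-1)| \le \delta |b| |u(t-1)|$ at every sample where $|u(t-1)| \le \rho$. The function name and the tolerance `tol` are illustrative choices, not part of the original development.

```python
def in_set_input_nl(a, b, y, u, k, delta, rho, tol=1e-12):
    """Accept (a, b) if the input-nonlinearity residual bound holds at all samples."""
    for t in range(1, len(y)):
        if abs(u[t - 1]) > rho:
            continue  # the sector (7) says nothing about this sample
        resid = y[t] + a * y[t - 1] - k * b * u[t - 1]
        if abs(resid) > delta * abs(b) * abs(u[t - 1]) + tol:
            return False
    return True
```

Because each sample contributes one pair of inequalities, every new measurement can only shrink the accepted set, which is the refinement property the text relies on.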

Example  2:  Output  Nonlinearity 

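In the above example, the nonlinearity was on the input. Now consider the case where the nonlinearity is on the output (see Figure 5) where

$$y = f(\bar{y}), \qquad \bar{y} = G_\theta u \tag{13}$$


Figure 5: Output nonlinearity.

Proceeding as before (write $y = k\bar{y} + e$ and note that $|y| \ge (k - \delta)|\bar{y}|$) we now have

$$A_\theta y - k B_\theta u = A_\theta e \tag{14}$$

where now $e(t)$ is an uncertain sequence satisfying

$$|e(t)| \le \frac{\delta}{k - \delta} |y(t)|, \quad \forall t = 1, \ldots, N \tag{15}$$

In this case the set estimate $\Theta_N$ consists of those $\theta$ satisfying

$$|y(t) + a y(t-1) - k b u(t-1)| \le \frac{\delta}{k - \delta} \left( |y(t)| + |a y(t-1)| \right) \tag{16}$$

for all $t = 1, \ldots, N$.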
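For comparison with the input-nonlinearity case, the following sketch checks the output-nonlinearity condition. It assumes the bound is applied in the form $|y(t) + a y(t-1) - k b u(t-1)| \le \frac{\delta}{k-\delta}(|y(t)| + |a y(t-1)|)$ and that the sector condition holds at every sample; the function name and tolerance are hypothetical.

```python
def in_set_output_nl(a, b, y, u, k, delta, tol=1e-12):
    """Accept (a, b) if the output-nonlinearity residual bound holds at all samples."""
    c = delta / (k - delta)   # requires delta < k, as in assumption (7)
    return all(
        abs(y[t] + a * y[t - 1] - k * b * u[t - 1])
        <= c * (abs(y[t]) + abs(a * y[t - 1])) + tol
        for t in range(1, len(y))
    )
```

Note that here the bound depends on the measured output rather than on the input, so the right-hand side also varies with the candidate parameter $a$.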

Example  3:  Mechanical  System 

Consider the mechanical configuration depicted in Figure 6.

This system represents the case where torsional actuation is applied to a load through a flexible gear-train. The gearing is shown to occur at the end of the flexible member, although other combinations are certainly possible.

Neglecting  any  electronic  dynamics,  and  assuming  that  the 
flexible  rod  is  both  uniform  and  damped,  the  motion  of  the 


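$$J_M \ddot{y}_1 = u + D(\dot{y}_2 - \dot{y}_1) + K(y_2 - y_1)$$

$$J_G \ddot{y}_2 = -\bar{u} - D(\dot{y}_2 - \dot{y}_1) - K(y_2 - y_1)$$

$$J_L \ddot{y}_3 = N \bar{u}$$

$$\bar{y} = y_2 - N y_3$$

$$\bar{u} = f(\bar{y})$$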

where $u$ denotes the input applied torque, $(y_1, y_2, y_3)$ are angular deflections as indicated in the figure, $\bar{y}$ is the relative gear angle, and $f(\cdot)$ is a memoryless nonlinearity arising from backlash in the gear train. The constants are defined as follows: $J_M$, $J_G$ and $J_L$ are the motor, motor gear, and load inertias, respectively, $N$ is the gear ratio which is greater than one, and $D$, $K$ are the damping and stiffness, respectively, of the elastic rod. The backlash nonlinearity $f(\cdot)$ has the typical shape shown in Figure 7.


Figure  7:  A  typical  backlash  function. 

The break-point parameter $y_b$ relates to gear teeth spacing and the slopes in the two regions relate to gear teeth shapes. Typically for $|z| > y_b$ the slope is very large, whereas for $|z| < y_b$ the slope is very small. It is clear that for some positive constants $(k, \delta, \rho)$, $f(\cdot)$ satisfies the sector condition (7).

To  illustrate  how  to  compute  a  set  estimate  for  the  param¬ 
eters  of  the  mechanical  system,  suppose  that  the  measured 
variables  are 


and  that  ( K,D )  are  uncertain  parameters,  i.e.  , 


Figure  6:  A  flexible  rotating  system  with  backlash  in  the 
gear-train. 

rigid body and first torsional “mode” for small angular deflections can be approximated by the system of differential equations,

Observe that the output of the nonlinear function is

$$\bar{u} = \frac{J_L}{N} \ddot{y}_3$$
One approach to describing a model set for this type of system is to approximate either the input or output of the nonlinear function. This would be like the ideal situations in the preceding two example systems. In this case, since $y_3$ is available as a measurement, the acceleration $\ddot{y}_3$ can be approximated by high pass filtering the measured output $y_3$. For example, let


S/3 


=  Fy„  cW  =  (^t) 


where  1/r  is  sufficiently  large  so  as  to  capture  the  dominant 
harmonics  in  the  acceleration.  Then, 

_  .  Jlc- 


With this approximation the situation is very similar to the example where the nonlinearity is on the output. However, there is one difference: here the input to the nonlinear function also contains a term due to $\bar{u}$, which can also be replaced by its approximation. Thus, the appropriate model can be described by the feedback system shown in Figure 8.


Figure  9:  Standard  model 


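1. $P_\theta$ is a transfer matrix which depends on a parameter $\theta \in \mathbb{R}^p$ and which has the block structure:

$$\begin{bmatrix} y \\ \bar{y} \end{bmatrix} = \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix} \begin{bmatrix} u \\ \bar{u} \end{bmatrix} \tag{18}$$

2. $f(\cdot)$ is a scalar memoryless nonlinearity in the sector,

$$\alpha z \le f(z) \le \beta z, \quad \forall |z| \le \rho$$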


Figure  8:  Feedback  nonlinearity. 


The system is described by

$$\bar{u} = f(\bar{y}),$$

where, after some algebra, we obtain

$$\bar{y} = -\frac{J_M \bar{u} + J_G u}{A_\theta} + y_1 - N y_3 \tag{17}$$

with

$$A_\theta(s) = J_M J_G s^2 + (J_M + J_G)(Ds + K)$$

The procedure described in Example 1 can now be applied to obtain a set estimate which will contain the true parameters. Of course the precise conditions under which the true parameters are in the set estimate involve various approximations. In particular, due consideration must be given to the error made in approximating $\bar{u}$.


where $0 < \alpha < \beta$.

3. The measured data set is

$$\{y(t), u(t) : t = 1, \ldots, N\}$$

The standard form allows for scalar memoryless sector bounded nonlinearities, but the measured signals $(y, u)$ can be vectors. Disturbances as well as nonparametric dynamic uncertainties can also be included by replacing the “feedback” with a more complicated system and by adding another input.


3  Robust  Linear  Control  Design 

As an illustrative example, consider the uncertain nonlinear plant with a linear feedback control,

$$y = d + f(\bar{y}), \qquad \bar{y} = G_\theta u, \qquad u = -Ky \tag{19}$$

where $G_\theta$ and $K$ are linear-time-invariant systems, $K$ is the linear feedback controller, $f(\cdot)$ is a memoryless nonlinearity, $y$ is the measured output to be controlled, and $d$ is a disturbance as seen at the output. The control objective is to attenuate the effect of the disturbance at the output despite the uncertainties in the system model. Specifically, the system uncertainties are as follows:

•  the nonlinear function $f(\cdot)$ is in the sector,

$$|f(\bar{y}) - \bar{y}| \le \delta |\bar{y}|, \quad \forall |\bar{y}| \le \rho$$


2.3  Standard Model Structure


Even  though  the  three  example  systems  are  fairly  general, 
it  is  also  important  to  point  out  that  they  do  not  exhaust 
all  the  myriad  possibilities.  A  very  general  model  format,  or 
template,  is  characterized  in  Figure  9. 

This  model  form  is  discussed  in  detail  in  [15].  Here  we  make 
the  following  assumptions: 


•  the parameters in the linear-time-invariant system $G_\theta$ are in the set $\Theta$.

Observe that these uncertainty sets can arise from a combination of set estimation and/or prior information. From the control design viewpoint the source of the uncertainty is not relevant.




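To analyze this system we make the following convenient definitions:

$$\Delta(\bar{y}) = f(\bar{y}) - \bar{y}$$

$$S_\theta = (1 + G_\theta K)^{-1}$$

$$T_\theta = G_\theta K (1 + G_\theta K)^{-1} = 1 - S_\theta$$

Observe that $\Delta(\cdot)$ satisfies the sector condition

$$|\Delta(\bar{y})| \le \delta |\bar{y}|, \quad \forall |\bar{y}| \le \rho$$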

The transfer functions $(S_\theta, T_\theta)$ are the closed-loop transfer functions from the disturbance $d$ to the output $y$ and, up to sign, to the linear plant output $\bar{y}$, respectively, if the nonlinear function $f(\cdot)$ is replaced by the identity operator, which in this case is the “nominal” nonlinearity. The nonlinear feedback system is then equivalently expressed as:

$$y = S_\theta (d + e)$$

$$e = \Delta(\bar{y})$$

$$\bar{y} = -T_\theta (d + e)$$


Now, let $T_\theta(t)$ denote the impulse response of $T_\theta(s)$, and suppose that there are constants $M \ge 1$, $a > 0$, and $r > 0$, independent of $\theta$, such that for all $t \ge 0$,

$$|T_\theta(t)| \le M e^{-at}$$

$$|(T_\theta d)(t)| \le r$$

Application of the Bellman inequality [11] yields:

$$|e(t)| \le \frac{\delta r}{1 - \delta M / a}$$

provided that

$$\delta < \frac{a}{M}, \qquad r \le (1 - \delta M / a)\, \rho$$

The above inequalities bound $e(t)$, which appears as an additional disturbance. Thus, the ideal closed-loop transfer functions $(S_\theta, T_\theta)$ must be shaped to make $e(t)$ small. In addition, the linear controller $K$ has other goals, e.g., to robustly stabilize the linear-time-invariant model set $\{G_\theta : \theta \in \Theta\}$.
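To make the role of the two side conditions concrete, the sketch below evaluates the bound on $|e(t)|$ for given constants. It assumes the bound has the form $\delta r / (1 - \delta M / a)$ with the conditions $\delta < a/M$ and $r \le (1 - \delta M/a)\rho$; the function name and the numerical values in the usage note are illustrative only.

```python
def error_bound(delta, M, a, r, rho):
    """Return delta*r/(1 - delta*M/a), or None when a side condition fails."""
    gain = delta * M / a          # must be strictly below one
    if gain >= 1.0:
        return None               # sector too wide for this decay rate
    if r > (1.0 - gain) * rho:
        return None               # y_bar could leave the region |y_bar| <= rho
    return delta * r / (1.0 - gain)
```

For instance, `error_bound(0.1, 2.0, 1.0, 0.5, 1.0)` evaluates $0.05 / 0.8$, whereas a sector width of `delta = 0.6` with the same `M` and `a` violates $\delta < a/M$ and the function returns `None`.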


4  Concluding  Remarks 


A separation principle between model set estimation and robust control design allows for a more comprehensible approach to adaptive control design. This approach differs from its predecessors in that model uncertainty is incorporated in the synthesis phase of the design rather than in the analysis phase.

A Family of Norms For System Identification Problems

Abstract

In this paper we introduce a family of norms that may prove useful in system identification problems. The important property of the new norm is that for a given sequence its value in the limit will converge to the supremum over all frequencies of the spectrum of the sequence. Using this property, a procedure is outlined to approximately minimize the weighted $\mathcal{L}_\infty$ norm of the frequency response estimation error.

1  Introduction 

The parametric approach to system identification is based on selecting an appropriate model structure and searching for the parameters of the model that best describe the data. Usually, the best model within the model set is characterized as the one that minimizes a selected norm of the prediction errors. By far the most popular norm is the sum of the squares of the prediction errors, the quadratic norm. In this paper we introduce a new family of norms that seem to be useful in system identification problems. The new norms have an interesting interpretation in the frequency domain and include the usual quadratic norm as a special case. The important property of the new norm is that in the limit its minimization is equivalent to minimizing the supremum over all frequencies of the spectrum of the prediction error, or equivalently minimizing its $\mathcal{L}_\infty$ norm.

2  Definitions  and  Preliminaries 

Let us assume we are given a scalar bounded sequence $\{e_i,\ i = 1, \ldots, N\}$ which in our application represents the prediction errors computed from the observed data and a guessed model parameter vector $\theta$. Based on this sequence, form the $(N + M - 1) \times M$ matrix

$$E_{NM} = \frac{1}{\sqrt{N}} \begin{bmatrix} e_1 & 0 & \cdots & 0 \\ e_2 & e_1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ e_M & e_{M-1} & \cdots & e_1 \\ \vdots & \vdots & & \vdots \\ e_N & e_{N-1} & \cdots & e_{N-M+1} \\ 0 & e_N & \cdots & e_{N-M+2} \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & e_N \end{bmatrix} \tag{1}$$

where we assume $1 \le M \le N$. Note that $E_{NM}$ is constant along the diagonals, and for $M = 1$, $E_{N1}$ is a column vector with $e_i / \sqrt{N}$ as its elements. To simplify the notation we denote this vector by $E_N$. Moreover, the matrix $E_{NM}$ is completely specified when $E_N$ ($= E_{N1}$) and the value of $M$ are given.

It is simple to see that the matrix $E_{NM}^T E_{NM}$ is symmetric, at least positive semi-definite, and Toeplitz. The elements of this matrix are estimates of the autocorrelation function of the sequence $e_i$. More explicitly, define the sequence $a_i$ ($i = 0, \ldots, M - 1$) in terms of $e_i$ as follows:

$$a_i = \frac{1}{N} \sum_{j=1}^{N-i} e_j e_{j+i} \tag{2}$$

Then a simple computation shows:

$$E_{NM}^T E_{NM} = \begin{bmatrix} a_0 & a_1 & \cdots & a_{M-1} \\ a_1 & a_0 & \cdots & a_{M-2} \\ \vdots & \vdots & & \vdots \\ a_{M-1} & a_{M-2} & \cdots & a_0 \end{bmatrix} \tag{3}$$

Using these definitions, we define the new norm as the maximum eigenvalue of $E_{NM}^T E_{NM}$,

$$V_M(E_N) = \bar{\lambda}(E_{NM}^T E_{NM}) = \bar{\sigma}^2(E_{NM}) \tag{4}$$

where $\bar{\lambda}(F)$ denotes the maximum eigenvalue of $F$ and $\bar{\sigma}(F)$ denotes the maximum singular value of $F$. For simplicity, we usually delete the argument of $V_M$ and assume it is understood to be a function of $E_N$ which is itself formed from the prediction errors $e_i$. Note that $V_M$ defined in (4) is not mathematically a norm on $\mathbb{R}^N$; however, $\sqrt{V_M(E_N)}$ is a valid norm for $E_N$, and only to simplify the presentation we refer to $V_M$ as a norm.
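A minimal numerical sketch of the definition may help fix ideas. It assumes NumPy is available and builds the matrix of (1) with the $1/\sqrt{N}$ scaling described above; the function names `build_E` and `V` are hypothetical.

```python
import numpy as np


def build_E(e, M):
    """(N + M - 1) x M Toeplitz matrix of (1), scaled by 1/sqrt(N)."""
    e = np.asarray(e, dtype=float)
    N = len(e)
    E = np.zeros((N + M - 1, M))
    for j in range(M):
        E[j:j + N, j] = e          # column j is the sequence delayed by j samples
    return E / np.sqrt(N)


def V(e, M):
    """V_M(E_N): largest eigenvalue of E_NM^T E_NM."""
    E = build_E(e, M)
    return np.linalg.eigvalsh(E.T @ E)[-1]   # eigvalsh returns ascending order
```

For `M = 1` this reduces to the mean square of the sequence, and for any `M` it agrees with the squared largest singular value of `build_E(e, M)`, which is the second equality in (4).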

Also, for $M = 1$, $V_M$ is identified with the usual quadratic norm. From another point of view, $V_1$ only includes an estimate of the autocorrelation function of the prediction error for zero shift, $a_0$. Moreover, $V_M$ is nicely bounded by $V_1$ as follows:

$$\|E_N\|_2^2 = V_1(E_N) \le V_M(E_N) \le M V_1(E_N) = M \|E_N\|_2^2 \tag{5}$$

To illustrate some of the properties of $V_M$ for $M > 1$, assume $M = 2$. The maximum eigenvalue of $E_{N2}^T E_{N2}$ is simple to compute and is given by

$$V_2 = a_0 + |a_1| \tag{6}$$

In this case, not only is the sum of squares of the prediction errors included in the performance measure, but the norm also includes an estimate of the autocorrelation function of the prediction error at the first time shift. Therefore, minimizing $V_2$ will force $|a_1|$ to small values. This is a first attempt at whitening the prediction error in addition to minimizing its variance.
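The $M = 2$ value can be verified directly from the Toeplitz form of $E_{NM}^T E_{NM}$; the short derivation below is added only as a check and uses no assumptions beyond the definitions of $a_0$ and $a_1$.

```latex
E_{N2}^T E_{N2} = \begin{bmatrix} a_0 & a_1 \\ a_1 & a_0 \end{bmatrix},
\qquad
\det \begin{bmatrix} a_0 - \lambda & a_1 \\ a_1 & a_0 - \lambda \end{bmatrix}
  = (a_0 - \lambda)^2 - a_1^2 = 0
\;\Longrightarrow\;
\lambda = a_0 \pm a_1 ,
\qquad
V_2 = \max(a_0 + a_1,\; a_0 - a_1) = a_0 + |a_1| .
```

The two eigenvectors are $[1\ \ 1]^T$ and $[1\ \ {-1}]^T$, so the larger eigenvalue picks up the sign of $a_1$ automatically.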

Note that the whiteness of the prediction error is an important factor in validating a computed model [5]. However, this desirable property of the prediction error is not reflected in any form in the usual quadratic norm. But $V_M$ is a function not only of the variance of the prediction error but also of the values of the autocorrelation of the prediction error for time shifts up to $M - 1$, and by increasing $M$ more and more of the temporal behavior of this autocorrelation affects $V_M$.

3  Frequency Domain Properties


Now we discuss the frequency domain interpretation of the new norm. First assume the limit of $a_i$ defined in (2) as $N$ goes to infinity exists:

$$\lim_{N \to \infty} a_i = \alpha_i \tag{7}$$

If, in addition, $\alpha_i$ is in $\ell_1$, then the spectrum of the prediction error is

$$S_{ee}(\omega) = \sum_{k=-\infty}^{\infty} \alpha_k e^{-i\omega k} \tag{8}$$

where we set $\alpha_{-k} = \alpha_k$, because we are dealing with a real sequence. It is shown in [3] that the following are true

$$\alpha_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} S_{ee}(\omega)\, d\omega \tag{9}$$

$$\bar{\lambda}(C_M) \le \sup_{|\omega| \le \pi} S_{ee}(\omega) = \lim_{M \to \infty} \bar{\lambda}(C_M) \tag{10}$$

$$\underline{\lambda}(C_M) \ge \inf_{|\omega| \le \pi} S_{ee}(\omega) = \lim_{M \to \infty} \underline{\lambda}(C_M) \tag{11}$$


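where $\underline{\lambda}(F)$ and $\bar{\lambda}(F)$ denote respectively the smallest and the largest eigenvalue of $F$, and the Toeplitz matrix $C_M$ is defined as follows:

$$C_M = \begin{bmatrix} \alpha_0 & \alpha_1 & \cdots & \alpha_{M-1} \\ \alpha_1 & \alpha_0 & \cdots & \alpha_{M-2} \\ \vdots & \vdots & & \vdots \\ \alpha_{M-1} & \alpha_{M-2} & \cdots & \alpha_0 \end{bmatrix} \tag{12}$$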


To explore the convergence property of (10) as $M$ goes to infinity, let us consider an exponentially correlated sequence $e_i$ with autocorrelation function given by

$$\alpha_k = r^{|k|}, \qquad 0 < r < 1 \tag{13}$$

The spectrum of $e_i$ is simple to compute

$$S_{ee}(\omega) = \frac{1 - r^2}{1 - 2r\cos\omega + r^2} \tag{14}$$

Let us denote the supremum of $S_{ee}(\omega)$ by $\bar{S}$. In Figure 1 the values of $100(\bar{S} - \bar{\lambda}(C_M))/\bar{S}$ are shown as a function of $M$ for values of $r$ from 0.1 to 0.9. As can be seen, for small values of $r$ (slightly correlated sequences) the convergence is rather fast. However, as $r$ gets closer to one the number $M$ needed to achieve a preset accuracy increases considerably. This figure is very useful in selecting an appropriate value for $M$ when a bound for the spectral content of the prediction error is known. Moreover, explicit computation shows that for $\alpha_k$ given in (13) the convergence of $\underline{\lambda}(C_M)$ to $\inf_\omega S_{ee}(\omega)$ is considerably faster than that observed in Figure 1.
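The quantity plotted in Figure 1 can be recomputed with a few lines of code. The sketch below assumes NumPy, the autocorrelation $\alpha_k = r^{|k|}$, and the fact that the spectrum (14) attains its supremum $(1 + r)/(1 - r)$ at $\omega = 0$; the function name is hypothetical and the original figure data are not reproduced here.

```python
import numpy as np


def rel_gap_percent(r, M):
    """100 (S - lambda_max(C_M)) / S for alpha_k = r^|k|, S = (1 + r)/(1 - r)."""
    k = np.arange(M)
    C = r ** np.abs(k[:, None] - k[None, :])    # Toeplitz matrix C_M of (12)
    lam_max = np.linalg.eigvalsh(C)[-1]          # largest eigenvalue
    S = (1 + r) / (1 - r)                        # value of (14) at omega = 0
    return 100 * (S - lam_max) / S
```

Evaluating `rel_gap_percent(r, M)` over a range of `M` for several `r` shows the trend described in the text: the gap shrinks as `M` grows, and for a fixed `M` it is larger when `r` is closer to one.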


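Theorem 1 The following limits hold

$$\lim_{N \to \infty} E_N^T E_N = \frac{1}{2\pi} \int_{-\pi}^{\pi} S_{ee}(\omega)\, d\omega \tag{15}$$

$$\lim_{M \to \infty} \left( \lim_{N \to \infty} \bar{\sigma}^2(E_{NM}) \right) = \sup_{|\omega| \le \pi} S_{ee}(\omega) \tag{16}$$

$$\lim_{M \to \infty} \left( \lim_{N \to \infty} \underline{\sigma}^2(E_{NM}) \right) = \inf_{|\omega| \le \pi} S_{ee}(\omega) \tag{17}$$

where we assume that $N$ goes to infinity faster than $M$.

Figure 1: Convergence of $\bar{\lambda}(C_M)$ to $\sup_\omega S_{ee}(\omega)$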


Proof: Relation (15) follows from the definition of $\alpha_0$ and (9). Moreover, by definition of $C_M$ we have

$$\lim_{N \to \infty} E_{NM}^T E_{NM} = C_M \tag{18}$$

Substituting this in (10) and (11) and noting that the eigenvalues of a matrix are continuous functions of the elements of the matrix, the other results follow immediately.

□

In identification problems we estimate the model parameters $\theta$ by minimizing $V_M(E_N(\theta))$. The notation $E_N(\theta)$ emphasizes the fact that the prediction error is a function of $\theta$ and the minimization is carried out over the elements of $\theta$. Relation (16) is very illuminating in this respect and shows that by minimizing $V_M$ as $M$ approaches infinity, the supremum over all frequencies of the spectrum of the prediction error is minimized. Because of this property, we refer to the identification problem using the new norm as the $\mathcal{L}_\infty$ identification problem. In contrast, by minimizing the usual quadratic norm, the integral of the spectrum of the prediction error over all frequencies is minimized [5], and this can be referred to as the $\mathcal{L}_1$ identification problem (see (15)).

As an aside, using (16) and (17), it is clear that the condition number of $E_{NM}$ is a good indication of the whiteness of the sequence $E_N$. When this condition number is close to 1, the spectral density function is close to being constant over all frequencies and the sequence is close to being uncorrelated. Large values of the condition number indicate that the sequence is correlated and the maximum and minimum values of the spectral density are far apart.

Now we explore the usefulness of the new norm in identification problems and relate the $\mathcal{L}_\infty$ norm of the spectrum of the prediction error to the $\mathcal{L}_\infty$ norm of the transfer function estimation error. Following the procedure used in [5], let us assume the true system output is generated by

$$y_t = G_0(q) u_t + v_t \tag{19}$$

where the additive noise $v_t$ has the spectrum

$$S_{vv}(\omega) = \lambda_0 |H_0(e^{i\omega})|^2 \tag{20}$$

with $H_0(\infty) = 1$. Also assume the suggested model for the system has the form

$$y_t = G(q, \theta) u_t + H(q, \theta) e_t \tag{21}$$

where $\theta$ is the vector of unknown parameters. It is simple to show that the spectrum of the prediction error in this case is given by [5]:

$$S_{ee}(\omega, \theta) = \frac{|\tilde{G}(e^{i\omega}, \theta)|^2 S_{uu}(\omega) + S_{vv}(\omega)}{|H(e^{i\omega}, \theta)|^2} \tag{22}$$

where $\tilde{G} = G - G_0$ is the error in estimating the transfer function.

Unfortunately, the term $S_{vv}/|H|^2$ in (22), which depends on the parameter $\theta$, prevents us from directly relating the minimization of $S_{ee}$ to the minimization of $|\tilde{G}|$. To circumvent this difficulty, we can first use a high order ARX model

$$A(q) y_t = B(q) u_t + e_t \tag{23}$$

to approximate $H_0(e^{i\omega})$ by $1/A(e^{i\omega})$, and filter both $u_t$ and $y_t$ by $A(q)$. Let us denote the filtered input and output by $u_t^f$ and $y_t^f$ respectively. Next use the following output error model to estimate the model parameters $\theta$

$$y_t^f = G(q, \theta) u_t^f + e_t \tag{24}$$

Now using (22) we have

$$S_{ee}(\omega, \theta) = |\tilde{G}(e^{i\omega}, \theta)|^2 |A(e^{i\omega})|^2 S_{uu}(\omega) + |A(e^{i\omega})|^2 S_{vv}(\omega) \tag{25}$$


If $1/|A|$ is a good approximation to $|H_0|$, then the last term in (25) is a constant equal to $\lambda_0$, and we can write

$$|\tilde{G}(e^{i\omega}, \theta)|^2 |A(e^{i\omega})|^2 S_{uu}(\omega) \approx S_{ee}(\omega, \theta) - \lambda_0 \tag{26}$$

Using (26), it is clear that minimizing the supremum of $S_{ee}$ in this case will directly lead to the minimization of the weighted $\mathcal{L}_\infty$ norm of $\tilde{G}$. Note that, as is expected, the weighting $|A|^2 S_{uu}$ ($\approx S_{uu}/|H_0|^2$) puts more emphasis on the frequency ranges where the signal to noise spectral ratio is large. Also, by repeating the experiment with a different input (changing $S_{uu}$), we have the flexibility of changing this weighting factor.

However, the approach we have outlined has a major drawback because it relies on using the output error form in (24). The norm of the prediction error in this case is not necessarily a convex function of the model parameters, and this may lead to a complicated minimization problem.

Note that after minimizing $V_M$ (for a sufficiently large value of $M$), we can compute a good estimate for the supremum over all frequencies of the left hand side of (26), since the supremum over all frequencies of the first term on the right hand side of (26) can be approximated by the minimum value of $V_M$, and the value of $\lambda_0$ (variance of the noise) can be approximated when we are computing the ARX structure in (23). This gives a bound for the weighted $\mathcal{L}_\infty$ norm of the modeling error.


4  Convergence  and  Convexity 

The norm introduced in (4) has some interesting properties that we shall discuss next. Let us fix $M$, and assume we are given a model structure and identify the parameter vector $\theta$ of this model by minimizing $V_P(E_N(\theta))$ where $P$ is a positive integer less than $M$ ($P < M$). Let us assume this optimization problem has a unique global minimum that we will denote by $\theta^P$, and denote the prediction error sequence resulting from this choice of the parameter vector by $E_N^P = E_N(\theta^P)$. Similarly define $\theta^M$ and $E_N^M = E_N(\theta^M)$.

Theorem 2 The following series of inequalities hold:

$$V_P(E_N^P) \le V_P(E_N^M) \le V_M(E_N^M) \le V_M(E_N^P) \tag{27}$$

Proof: Beginning from the left hand side, the first inequality follows from the fact that the elements of $E_N^P$ are generated from the model parameters that minimize $V_P$. The second inequality follows from the interlacing property of the eigenvalues of a symmetric matrix [2]. Note that $(E_{NP}^M)^T E_{NP}^M$ is the leading $P \times P$ principal submatrix of $(E_{NM}^M)^T E_{NM}^M$, where $E_{NP}^M$ and $E_{NM}^M$ are defined in terms of $E_N^M$ using (1). The third inequality follows from the fact that $E_N^M$ is formed from model parameters that minimize $V_M$.

□

The relation given in (27) is especially useful if we set $P = 1$. Then $V_1(E_N^1)$ is the minimum value of the usual least squares performance measure. Also in this case we can add another important inequality to the set given in (27).

Corollary  1  The  following  series  of  inequalities  hold 

V, (Ejq)  <  V,(£#)  <  VM{Ef$)  <  Vm(E'n)  < 

Vi(E'„)  +  (M  -  1)  max(|a}|, . . . ,  K,_,|)  (23) 

where  aj  are  computed  from  the  elements  of  Ejq  using  the  relation  given  in  (2). 

Proof: The first three inequalities follow by setting $P = 1$ in (27). Moreover, because $Q = (E_{NM}^1)^T E_{NM}^1$ is Toeplitz with $a_0^1$ on its main diagonal, each eigenvalue $\lambda$ of $Q$ satisfies the following inequality

$$|\lambda - a_0^1| \le (M-1)\max(|a_1^1|, \ldots, |a_{M-1}^1|) \qquad (29)$$

This follows from Gershgorin's circle theorem [2]. Since $V_M(E_N^1)$ is the largest eigenvalue of $Q$ and $a_0^1 = V_1(E_N^1)$, the last inequality in (28) holds. Note that the $a_j^1$ in (28) are estimates of the autocorrelation function of the prediction error computed from parameters that are obtained by minimizing the quadratic norm.

□ 
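The middle inequality of Theorem 2 and the Gershgorin bound of Corollary 1 hold for any fixed error sequence, so they can be checked numerically. The following is a minimal sketch, assuming (as in the construction of $E_{NM}$ used in this paper) that the columns of $E_{NM}$ are successively delayed copies of the scaled error sequence and that $V_M$ is the squared largest singular value of $E_{NM}$. The function names and the random test sequence are illustrative only.

```python
import numpy as np

def shifted_error_matrix(e, M):
    """(N+M-1) x M matrix whose i-th column is e delayed by i samples."""
    N = len(e)
    E = np.zeros((N + M - 1, M))
    for i in range(M):
        E[i:i + N, i] = e
    return E

def V(e, M):
    """Squared largest singular value of the shifted error matrix."""
    return np.linalg.norm(shifted_error_matrix(e, M), 2) ** 2

rng = np.random.default_rng(0)
N, M, P = 200, 8, 3
e = rng.standard_normal(N) / np.sqrt(N)    # scaled error sequence (illustrative)

# Q = E^T E is Toeplitz; its first column holds a_0, ..., a_{M-1}.
E_NM = shifted_error_matrix(e, M)
Q = E_NM.T @ E_NM
a = Q[:, 0]

v1, vP, vM = V(e, 1), V(e, P), V(e, M)
bound = v1 + (M - 1) * np.max(np.abs(a[1:]))
print(v1, vP, vM, bound)
```

For a whiter sequence the off-diagonal terms $a_j$ shrink, so the gap between `v1` and `bound` closes, which is the mechanism used in the limiting argument that follows.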

Now let us assume that for a particular problem $V_1(E_N(\theta))$ and $V_M(E_N(\theta))$ both have unique global minima, denoted by $\theta^1$ and $\theta^M$ respectively. Moreover, let us assume that in this problem, the last term in (28) goes to zero as the number of data points increases. In other words, assume that for a fixed $M$ we have

$$\lim_{N \to \infty} \max(|a_1^1|, \ldots, |a_{M-1}^1|) = 0 \qquad (30)$$

Then using (28), it is clear that

$$\lim_{N \to \infty} V_1(E_N(\theta^1)) = \lim_{N \to \infty} V_1(E_N(\theta^M)) \qquad (31)$$

Now using the assumption on the uniqueness of the global minimum of $V_1$, it is clear that in the limit $\theta^1$ and $\theta^M$ will be identical. Put more loosely, if the prediction error for the quadratic norm minimization is white, then the parameters obtained by minimizing the new norm will be identical to those obtained by minimizing the usual quadratic norm.

To guarantee that $V_1$ and $V_M$ each have global minima only, let us choose an ARX model for the structure of the system. In this case it is well known that the scaled prediction error can be written as

$$E_N = \frac{1}{\sqrt{N}}(Y - \Phi\theta) \qquad (32)$$

where $\Phi$ is the matrix of regression vectors and $Y$ is the vector of output values [5]. In this case, $V_1(E_N(\theta))$ is a convex function of $\theta$, and the minimization problem has only global minima. Moreover, if $\Phi$ has full column rank, then the minimum is unique.

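Next we show that for an ARX model structure, $V_M$ is also a convex function of the parameters. To see this, note that the matrix $E_{NM}$ can be written as

$$E_{NM} = \sum_{i=1}^{M} T_i E_N w_i^T = \frac{1}{\sqrt{N}} \sum_{i=1}^{M} T_i (Y - \Phi\theta) w_i^T \qquad (33)$$

where $w_i \in \mathcal{R}^M$ is the standard basis column vector with 1 in its $i$-th entry and all other elements zero. Also the $(N + M - 1) \times N$ matrix $T_i$ is defined as follows:

$$T_i = \begin{bmatrix} 0_{(i-1) \times N} \\ I_{N \times N} \\ 0_{(M-i) \times N} \end{bmatrix} \qquad (34)$$

Moreover, denote the $j$-th column of $\Phi$ by $\phi_j$ and the $j$-th element of $\theta \in \mathcal{R}^L$ by $\theta_j$. Then (33) can be rewritten as

$$E_{NM} = C_0 - \sum_{j=1}^{L} \theta_j C_j, \qquad C_0 = \frac{1}{\sqrt{N}} \sum_{i=1}^{M} T_i Y w_i^T, \qquad C_j = \frac{1}{\sqrt{N}} \sum_{i=1}^{M} T_i \phi_j w_i^T \qquad (35)$$

Note that $C_j$, $j = 0, \ldots, L$, have the same special structure as $E_{NM}$, namely being constant along the diagonals. Now using (35), it is clear that $E_{NM}$ is affine in $\theta$, and $\bar{\sigma}(E_{NM})$ is a convex function of $\theta$. Therefore, $V_M$, being the square of the nonnegative convex function $\bar{\sigma}(E_{NM})$, is also a convex function of $\theta$ and the minimization problem has only global minima in this case.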
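The affine structure and the resulting convexity can be illustrated numerically. The sketch below builds the shift matrices $T_i$ explicitly, forms $E_{NM}$ as a constant term minus a $\theta$-weighted sum of structured matrices (the sign convention is an assumption of this sketch), and checks the midpoint convexity inequality for two arbitrary parameter vectors. The data are random and purely illustrative.

```python
import numpy as np

def shift_matrix(i, N, M):
    """T_i: (N+M-1) x N, an N x N identity with i-1 zero rows above (i = 1..M)."""
    T = np.zeros((N + M - 1, N))
    T[i - 1:i - 1 + N, :] = np.eye(N)
    return T

def stack(v, M):
    """(1/sqrt(N)) * sum_i T_i v w_i^T: column i holds v delayed by i-1 samples."""
    N = len(v)
    out = np.zeros((N + M - 1, M))
    for i in range(1, M + 1):
        w = np.zeros(M)
        w[i - 1] = 1.0
        out += shift_matrix(i, N, M) @ np.outer(v, w)
    return out / np.sqrt(N)

def E_affine(theta, Y, Phi, M):
    """E_NM assembled as C_0 - sum_j theta_j C_j."""
    E = stack(Y, M)
    for j, t in enumerate(theta):
        E = E - t * stack(Phi[:, j], M)
    return E

def V_M(theta, Y, Phi, M):
    return np.linalg.norm(E_affine(theta, Y, Phi, M), 2) ** 2

rng = np.random.default_rng(1)
N, L, M = 60, 2, 4
Phi = rng.standard_normal((N, L))                      # illustrative regressors
Y = Phi @ np.array([0.7, -0.2]) + 0.1 * rng.standard_normal(N)

ta, tb = rng.standard_normal(L), rng.standard_normal(L)
mid = V_M(0.5 * (ta + tb), Y, Phi, M)
avg = 0.5 * (V_M(ta, Y, Phi, M) + V_M(tb, Y, Phi, M))
print(mid, avg)
```

Building each $T_i$ as a dense matrix is wasteful for large $N$ and $M$; it is done here only to mirror the notation.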

Using these facts, if we use an ARX model structure and if it happens that the resulting prediction errors are white (and consequently the relation (30) holds), then we are guaranteed that the parameter estimate using the new norm will be the same as the estimate using the quadratic norm. This is promising because for the sum-of-squares norm and ARX structure there are many established properties [5] that readily extend to the new norm.

However, if the prediction error sequence is not white, which will be the case if the 'true' model does not have an ARX structure, then the estimate given by minimizing $V_M$ will usually be different from that obtained from the quadratic norm minimization. Note that the new norm forces the autocorrelation of the prediction error for nonzero shifts to small values (it whitens the prediction error), and this property may result in a better estimate of the model parameters (compared to the quadratic norm for a given model order) when the true model is not actually inside the model set.

As we have shown previously, for an ARX model structure, the matrix $E_{NM}$ is affine in the parameters and we are interested in minimizing the maximum singular value of $E_{NM}$. This problem is already discussed in the literature [7, 4] and a recent algorithm is proposed in [1]. However, by exploiting the special structure of the matrices $C_j$ defined in (35), it may be possible to increase the efficiency of the algorithm in [1]. Also in our application, the size of the matrices involved is quite large and special attention should be paid to memory management and algorithmic implementation; otherwise huge amounts of memory will be required to perform the optimization even for modest values of $M$ and $N$.
5  Numerical  Example

We solved a numerical example to illustrate the properties of the new norm. To perform the minimization, we used the standard OPTIMIZE routine available in the MATRIXx software package [6]. The true model was chosen to be the zero order hold equivalent of a second order lightly damped mode sampled at 1 Hz. The frequency response of the true model is shown in Figure 2 as a solid line. The measured output was assumed to be the sum of the output of the true model and a filtered white Gaussian pseudo-random sequence. The input is a white pseudo-random Gaussian sequence. The signal-to-noise ratio is chosen to be 5. The number of data points used is 512.

We assumed a second order ARX model for the system and estimated the parameters of the model by minimizing $V_{32}(E_N)$ and $V_1(E_N)$. Note that the true model is in output error [5] form. The resulting estimated transfer functions are denoted by $G^{32}$ and $G^1$ respectively, and the true transfer function is denoted by $G_0$, with the magnitude of the frequency responses shown in Figure 2. The magnitudes of the errors $G_0 - G^{32}$ and $G_0 - G^1$ are shown in Figure 3. The spectrum and autocorrelation of the prediction errors $e^{32}$ and $e^1$ that are obtained from the optimal parameter estimates $\theta^{32}$ and $\theta^1$ respectively are shown in Figures 4 and 5. The spectrum is estimated using a Hamming window with a length of 32 points.

For the optimal estimates, the values of the objective functions are as follows:

$$V_{32}(E_N(\theta^{32})) = 0.1926, \qquad V_1(E_N(\theta^{32})) = 0.1473 \qquad (36)$$

$$V_1(E_N(\theta^1)) = 0.0956, \qquad V_{32}(E_N(\theta^1)) = 0.2781 \qquad (37)$$

Note that the values of $V_{32}(E_N(\theta^{32}))$ and $V_{32}(E_N(\theta^1))$ are in close agreement with the maximum of the spectrum of the prediction errors shown in Figure 4. Moreover, the values of the autocorrelation of $e^{32}$ at nonzero shifts are much smaller than those of $e^1$. In other words, $e^{32}$ is close to being white but $e^1$ is clearly correlated. However, the variance of $e^{32}$ is considerably larger than that of $e^1$.
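As a quick consistency check, the four reported objective values can be compared against the ordering predicted by Theorem 2 with $P = 1$ and $M = 32$. The short sketch below only restates the numbers from (36) and (37); the variable names are illustrative.

```python
# Objective values reported in (36) and (37).
v1_theta1 = 0.0956     # V_1 at the quadratic-norm estimate
v1_theta32 = 0.1473    # V_1 at the new-norm estimate
v32_theta32 = 0.1926   # V_32 at the new-norm estimate
v32_theta1 = 0.2781    # V_32 at the quadratic-norm estimate

# Theorem 2 with P = 1, M = 32 predicts this ordering.
chain = [v1_theta1, v1_theta32, v32_theta32, v32_theta1]
chain_holds = all(a <= b for a, b in zip(chain, chain[1:]))
print(chain_holds)  # prints True
```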
Figure  3:  Magnitude  of  Frequency  Response  of  error


Figure  4:  Estimated  Spectrums  of  Prediction  Errors 
Figure  5:  Autocorrelations  of  Prediction  Errors


6  Conclusion 

Although we have presented some preliminary results on the properties of the $\mathcal{L}_\infty$ identification problem in this paper, much further work is required to explore the properties of the new norm in detail. To perform this task, an efficient implementation of the minimization algorithm is needed so that realistic high order models can be estimated and their properties compared with those of least squares minimization. As we previously noted, the convexity of the new norm when an ARX model is used is an important property, and hence many techniques of convex optimization can be used for the solution of this problem.
Abstract A new approach is given for the design of adaptive robust control in the frequency domain. Starting with an initial model and a robust stabilising controller, the new (windsurfer) approach allows the bandwidth of the closed-loop system to be increased progressively through an iterative control-relevant system identification and control design procedure. Encouraging results are obtained in the case studies that serve as a benchmark test for the new idea.

1  Introduction 

It has long been understood that a key problem in control systems design is to handle the uncertainties associated with the plant [12]. Two main techniques for the analysis and design of systems with significant uncertainties are adaptive control [8] and robust control [6, 15].

In the traditional approach to analysis and design of an adaptive control system [8], it is assumed that the unknown plant can be represented by a model in which everything is known except for the values of a finite number of parameters. Once the parameters are estimated (and even during the estimation process), the principle of certainty equivalence is normally invoked to update the controller. Normally the unstructured uncertainties of the model are ignored in this approach. Therefore it is not surprising, as pointed out in [18], that these adaptive controllers are often not robust. Further, the extensions of the traditional approach to adaptive control which purportedly cope with unstructured (and other) uncertainties involve conditions which are often hard to apply or to grasp intuitively; see for example [1, 3, 13]. A further problem with the traditional approach is that extreme transient excursions are possible even when global convergence and asymptotic performance are guaranteed [21].

To be more specific, we consider an adaptive control system as shown in figure 1, where $G$ is the unknown transfer function of the plant. The time axis is divided into intervals such that during the $i$th interval, the control input applied to the plant is obtained from $K_i$, where $K_i$ is the transfer function of the controller designed using the model $G_{i-1}$ obtained at the end of the $(i-1)$th time interval.

In an adaptive control problem, the ulterior objective for finding $G_i$, an estimate of $G$ updated from $G_{i-1}$, is to redesign a better controller $K_{i+1}$ than $K_i$, such that certain control objectives are improved. For example, if $T^d$ represents the desired complementary sensitivity function, then we may like to have

$$\left\| \frac{G K_i}{1 + G K_i} - T^d \right\|_\infty < \left\| \frac{G K_{i-1}}{1 + G K_{i-1}} - T^d \right\|_\infty.$$

Implicitly, this means we would like to minimize

$$\left\| \frac{G K_i}{1 + G K_i} - T^d \right\|_\infty.$$

Since $G$, the transfer function of the plant, is unknown, we could only base our design of $K_i$ on $G_{i-1}$ such that

$$K_i = \arg\min_K \left\| \frac{G_{i-1} K}{1 + G_{i-1} K} - T^d \right\|_\infty.$$

Note that, as usual, we have invoked the principle of certainty equivalence. However, it is important to realize that

$$\left\| \frac{G K_i}{1 + G K_i} - T^d \right\|_\infty$$

is not necessarily small, even though

$$\left\| \frac{G_{i-1} K_i}{1 + G_{i-1} K_i} - T^d \right\|_\infty$$

is a minimum. This partly explains why traditional adaptive control systems, which invariably invoke the principle of certainty equivalence, have unsatisfactory robustness properties.

In the robust control approach [6, 15], a controller is designed based on a nominal model of the plant with the associated parametric and unstructured model uncertainties explicitly taken into account. Therefore stability robustness is guaranteed and performance robustness is sometimes achieved. The weakness of this approach is that it considers only the a priori information on the model, and neglects the fact that characteristics of the plant could be learnt while it is being controlled. Therefore, the robust control approach tends to result in a conservative design in terms of performance. It is likely that a posteriori knowledge about the plant could be used to reduce the conservatism in a robust control design.


2  The  Windsurfer  Approach  to  Adaptive  Control

By  considering  how  humans  learn  windsurfing,  Anderson  and  Kosut 
[2]  have  made  the  following  observations: 

1. The human first learns to control over a limited bandwidth, and learning pushes out the bandwidth over which an accurate model of the plant is known.

2. The human first implements a low gain controller, and learning allows the loop to be tightened.

Based on these observations an adaptive robust control design philosophy, the windsurfer approach, is proposed in [2]. It recognizes that, at the outset, the plant characteristics can differ greatly from the estimated model at any one time, particularly during the initial learning stage. In the new design approach, a low gain controller will first be implemented, and the control bandwidth will be small. Based on learning a frequency domain description of the plant in closed loop, with the learning process progressively increasing the bandwidth over which the plant is accurately known, the controller gain can be increased appropriately over an increasing frequency band. For details, refer to [2]. Importantly, in the method suggested, the necessary closed-loop system identification task is simplified into an open-loop system identification problem through the use of coprime fractional representations as discussed in [9, 10].

It was shown recently in [19] that the best model for control design cannot be derived from open-loop experiments alone. The controller to be implemented should be taken into account by the system identification experiments. However, this controller is not yet available, as its determination rests on the results of the system identification to be carried out. Hence, a general solution to the combination of system identification and control design is necessarily iterative. It was also shown in [22] that an iterative approach for model refinement and control robustness enhancement can be developed for an $H_2$ control problem. Although the emphasis of [19] is on the problem of modeling for control design, its approach is very similar to that of [2]. In the next section, we illustrate the windsurfer approach by considering a model matching problem in the context of adaptive control.

3  Adaptive  Model  Matching 

Let $G$ be the unknown transfer function of the plant, and let $T^d$ represent a desired complementary sensitivity function. We wish to achieve, through iterative system identification and control design, the minimization of the cost function

$$\left\| \frac{GK}{1 + GK} - T^d \right\|_\infty$$

where $K$ is the transfer function of a controller to be designed.

We begin by designing a controller $K_{1,0}$ to stabilize a known initial model $G_0$, which may be obtained from an open-loop system identification exercise. If $K_{1,0}$ also stabilises the unknown transfer function $G$, then we say that $K_{1,0}$ robustly stabilizes $G_0$. Notice that we use $K_{j,i}$ to denote the $j$th controller designed using the $i$th model, which has a transfer function $G_i$. In general, we attach the subscript $j,i$ to a transfer function to denote that it is either specified or derived on the basis of the $i$th model for the plant at the $j$th iteration of control design. Since $G_0$ may involve significant uncertainties, the resulting controller $K_{1,0}$ may not be able to achieve a small value for

$$\left\| \frac{G_0 K_{1,0}}{1 + G_0 K_{1,0}} - T^d \right\|_\infty$$

while robustly stabilizing $G_0$. In general, we need to consider how to handle the question of securing robust stabilization of $G_i$ by $K_{j,i}$. This is bound up with the question of selection of $T^d$. It is in fact to be expected that a sequence of $T^d$ will be selected in such a way that the end control objective can be approached in stages. We shall therefore proceed as follows.

Associated with each of the models $G_i$, a sequence of controllers $K_{j,i}$ is to be designed such that

$$K_{j,i} = \arg\min_K \left\| \frac{G_i K}{1 + G_i K} - T^d_{j,i} \right\|_\infty \qquad (3.1)$$

where the sequence of functions $T^d_{j,i}$ is specified with $T^d_{j+1,i}$ normally of wider bandwidth than $T^d_{j,i}$, and with $T^d_{j,i}$ resulting in a controller $K_{j,i}$ that robustly stabilizes $G_i$. A stage will be reached (say when $j = N$) where the bandwidth of the nominal closed-loop transfer function,

$$\hat{T}_{N,i} = \frac{G_i K_{N,i}}{1 + G_i K_{N,i}}, \qquad (3.2)$$

cannot be increased further without causing the effects of model uncertainties in $G_i$ to be too significant. This occurs when the value of

$$\| T_{N,i} - \hat{T}_{N,i} \|_\infty$$

is no longer small, where

$$T_{N,i} = \frac{G K_{N,i}}{1 + G K_{N,i}} \qquad (3.3)$$

is the actual closed-loop transfer function of the system.

At this stage it is necessary to improve the accuracy of the model in a way that is relevant to the control objective. This means that we should try to find an updated model $G_{i+1}$ such that

$$G_{i+1} = \arg\min_{\hat{G}} \left\| \frac{G K_{N,i}}{1 + G K_{N,i}} - \frac{\hat{G} K_{N,i}}{1 + \hat{G} K_{N,i}} \right\|_\infty \qquad (3.4)$$

Equation 3.4 would be the formulation of a standard rational function approximation problem, provided that $G$ were known. In the simulation (section 6), we shall take this approach by using a known transfer function for $G$. This serves as a benchmark test of the windsurfer approach as it corresponds to performing system identification with an infinite number of noiseless measurements. It is a topic of further research to deal with this problem in a realistic system identification setting when only a finite number of (possibly noisy) input-output measurements are available.

Once $G_{i+1}$ is found, we can continue to increase the closed-loop bandwidth by repeating the procedure described for $G_i$ previously. However, $G_{i+1}$ should be used instead of $G_i$, and we specify a new sequence of functions $T^d_{j,i+1}$ with $T^d_{1,i+1} = T^d_{N,i}$. The iterative process is continued until the end control objective is achieved or it is prematurely terminated because of one or more of the following constraints:

1. fundamental performance limitations due to right half plane poles and zeros of the plant and/or models [7],

2. an unstable model is obtained. (This is a consequence of our simplified control design method. Appropriate extensions of the control design method [15] allow us to deal with this restriction.)

3. finite control energy.


4  Closed-loop  System  Identification 

We first review a method for closed-loop system identification developed by Hansen [10]. Subsequently, in theorem 4.2, we demonstrate that with appropriate signal filtering, Hansen's method provides a suitable framework to carry out the control-relevant system identification formulated in section 3. For the sake of expository simplicity, we shall consider only scalar plants. We begin with the following theorem [20]:

Theorem 4.1 If $K = X/Y$ is a controller, where $X$ and $Y$ are stable proper transfer functions, and if $N$ and $D$ are stable proper transfer functions that satisfy the Bezout identity

$$NX + DY = 1,$$

then the set of all plants stabilized by the controller $K$ is precisely the set of elements in

$$\mathcal{G} = \left\{ \frac{N + RY}{D - RX} : R \text{ is a stable proper transfer function} \right\}.$$
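The parametrization in theorem 4.1 can be spot-checked on a frequency grid. The sketch below picks a simple stable model, a controller factorization satisfying the Bezout identity, and an arbitrary stable $R$ (all hypothetical choices made for illustration), then confirms that the closed-loop transfer functions of the resulting plant reduce to products of stable factors, which is what makes the closed loop stable.

```python
import numpy as np

s = 1j * np.logspace(-2, 2, 50)      # frequency grid s = j*omega

# Hypothetical stable model with N = Gi, D = 1.
Gi = 1.0 / (s + 1.0)
N_, D_ = Gi, np.ones_like(s)

# Hypothetical controller factors: X = Q, Y = 1 - Q*Gi with Q stable.
Q = 2.0 / (s + 2.0)
X, Y = Q, 1.0 - Q * Gi
K = X / Y

# Arbitrary stable proper R generating another plant in the set.
R = 0.5 / (s + 3.0)
G = (N_ + R * Y) / (D_ - R * X)

bezout = N_ * X + D_ * Y             # should equal 1 at every frequency
T = G * K / (1.0 + G * K)            # complementary sensitivity
S = 1.0 / (1.0 + G * K)              # sensitivity

print(np.max(np.abs(bezout - 1.0)))
print(np.max(np.abs(T - (N_ + R * Y) * X)))
print(np.max(np.abs(S - (D_ - R * X) * Y)))
```

A frequency-grid check of this kind verifies the algebraic identities only; it is not by itself a stability test.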


Consider the feedback system shown in figure 4, where $y$ and $u$ are the measured output and the control input, respectively, $e$ is an unpredictable white disturbance, and $r_1$ and $r_2$ are user applied inputs. It is assumed that $K_{j,i}$ is a known stabilizing controller, $G$ is inexactly known and possibly unstable, and, as is standard [14], $H$ is imperfectly known, stable and inversely stable. The system identification problem is to obtain improved estimates of $G$ and $H$ from a finite interval of measured and known data $\{y, u, r_1, r_2 : 0 \le t \le T\}$.

Following Hansen [10], we introduce the stable proper transfer functions $X_{j,i}$, $Y_{j,i}$, $N_i$, and $D_i$ which satisfy

$$K_{j,i} = \frac{X_{j,i}}{Y_{j,i}}, \qquad G_i = \frac{N_i}{D_i},$$

and

$$N_i X_{j,i} + D_i Y_{j,i} = 1.$$

The interpretation is that $G_i$ is a known but imperfect model of the plant which is also stabilized by $K_{j,i}$. Applying theorem 4.1 as shown in [10], there exist stable proper transfer functions $R_{i,j}$ and $S_{i,j}$, with $S_{i,j}$ also inversely stable, such that

$$G = \frac{N_i + R_{i,j} Y_{j,i}}{D_i - R_{i,j} X_{j,i}} \qquad (4.1)$$

$$H = \frac{S_{i,j}}{D_i - R_{i,j} X_{j,i}} \qquad (4.2)$$

where $R_{i,j}$ denotes the parametrization of $G$ using the $i$th model and its associated $j$th controller $K_{j,i}$.

As a result, system identification of $G$ and $H$ in closed-loop is equivalent to system identification of the stable proper transfer functions $R_{i,j}$ and $S_{i,j}$. Using equations 4.1 and 4.2, we can represent the feedback system as shown in figure 4.

From figure 4, we can write

$$\beta = R_{i,j}\alpha + S_{i,j}e, \qquad (4.3)$$

where

$$\alpha = X_{j,i}y + Y_{j,i}u, \qquad (4.4)$$

and

$$\beta = D_i y - N_i u. \qquad (4.5)$$

However, as

$$u = K_{j,i}(r_1 - y) + r_2$$

and

$$K_{j,i} = \frac{X_{j,i}}{Y_{j,i}},$$

equation 4.4 can be re-written as

$$\alpha = X_{j,i}r_1 + Y_{j,i}r_2. \qquad (4.6)$$

It is important to observe from equations 4.3, 4.5 and 4.6 that $\alpha$ depends on the applied signals $r_1$ and $r_2$ operated on by the known stable proper transfer functions $X_{j,i}$ and $Y_{j,i}$ respectively, and $\beta$ depends on the measured signals $y$ and $u$ operated on by the known stable proper transfer functions $D_i$ and $N_i$ respectively. Moreover, $\alpha$ is independent of the transfer functions $G$ and $H$ and the disturbance $e$. Hence the system identification of $G$ and $H$ in closed-loop has been recast into the system identification of $R_{i,j}$ and $S_{i,j}$ in open-loop. We shall next state a result which is highly relevant to the system identification step of the windsurfer approach to adaptive control.

Theorem 4.2 Let the controller $K_{j,i}$ stabilize the plant transfer function $G$ and the model transfer function

$$G_i = \frac{N_i}{D_i},$$

where $N_i$ and $D_i$ are stable proper transfer functions, and let

$$K_{j,i} = \frac{X_{j,i}}{Y_{j,i}},$$

where $X_{j,i}$ and $Y_{j,i}$ are stable proper transfer functions satisfying the Bezout identity

$$N_i X_{j,i} + D_i Y_{j,i} = 1.$$

Let $G_{i+1}$ be another model of $G$, also stabilized by $K_{j,i}$ and therefore having a description

$$G_{i+1} = \frac{N_i + r_{i,j} Y_{j,i}}{D_i - r_{i,j} X_{j,i}}, \qquad (4.7)$$

where $r_{i,j}$ is a stable proper transfer function. Also define the filtered output error

$$\xi = Y_{j,i}(\beta - r_{i,j}\alpha),$$

where, with $r_2 = 0$,

$\alpha = X_{j,i} r_1$,
$\beta = D_i y - N_i u$,
$r_1$ = reference signal,
$y$ = plant output,
$u$ = control input.

Thus $\xi$ is an error arising in the (open-loop) identification of $R_{i,j}$ through an estimate $r_{i,j}$. Then the filtered output error can be expressed as

$$\xi = \left[ \frac{G K_{j,i}}{1 + G K_{j,i}} - \frac{G_{i+1} K_{j,i}}{1 + G_{i+1} K_{j,i}} \right] r_1 + \frac{H}{1 + G K_{j,i}}\, e.$$

The proof is not given due to space limitations.

Suppose that the value of

$$\left\| \frac{G K_{j,i}}{1 + G K_{j,i}} - \frac{G_i K_{j,i}}{1 + G_i K_{j,i}} \right\|_\infty \qquad (4.8)$$

has become large. As described in section 3, we want a new identification of $G$ via $G_{i+1}$ for which

$$\left\| \frac{G K_{j,i}}{1 + G K_{j,i}} - \frac{G_{i+1} K_{j,i}}{1 + G_{i+1} K_{j,i}} \right\|_\infty \qquad (4.9)$$

is small. We are going to use the $r_{i,j}$ parametrization of $G_{i+1}$. By substituting equations 4.1 and 4.7 into expression 4.9, and noting that

$$K_{j,i} = \frac{X_{j,i}}{Y_{j,i}},$$

we can, after simplification, conclude that

$$\left\| \frac{G K_{j,i}}{1 + G K_{j,i}} - \frac{G_{i+1} K_{j,i}}{1 + G_{i+1} K_{j,i}} \right\|_\infty = \left\| Y_{j,i} X_{j,i} (R_{i,j} - r_{i,j}) \right\|_\infty \qquad (4.10)$$

should be small.

Remarks

• Note that

$$T_{j,i} = \frac{G K_{j,i}}{1 + G K_{j,i}}$$

is the actual closed-loop transfer function of the system, and

$$\hat{T}_{j,i} = \frac{G_i K_{j,i}}{1 + G_i K_{j,i}}$$

is the nominal closed-loop transfer function of the system. Therefore, using substitutions similar to those that resulted in equation 4.10, with $G_i$ in place of $G_{i+1}$, we can obtain

$$T_{j,i} - \hat{T}_{j,i} = Y_{j,i} X_{j,i} (R_{i,j} - r_{i,j}). \qquad (4.11)$$

However, since $G_i$ itself corresponds to

$$r_{i,j} = 0, \quad \forall j, \forall i,$$

we therefore have

$$T_{j,i} - \hat{T}_{j,i} = Y_{j,i} X_{j,i} R_{i,j}. \qquad (4.12)$$

By comparing the argument of the $\infty$-norm given in expression 4.8 with the left hand side of equation 4.12, we see immediately that when the value of

$$\left\| \frac{G K_{j,i}}{1 + G K_{j,i}} - \frac{G_i K_{j,i}}{1 + G_i K_{j,i}} \right\|_\infty$$

has become large, that is, when the closed-loop property of the actual system ($T_{j,i}$) is significantly different from the closed-loop property of the nominal system ($\hat{T}_{j,i}$), the value of

$$\left\| Y_{j,i} X_{j,i} R_{i,j} \right\|_\infty$$

will be large.

• From the signals defined in theorem 4.2, we observe that $R_{i,j}$, the transfer function to be identified, is excited by the signal $\alpha$, where

$$\alpha = X_{j,i} r_1,$$

and

$$X_{j,i} = \frac{K_{j,i}}{1 + G_i K_{j,i}}.$$

Since the nominal closed-loop transfer function of the system is

$$\hat{T}_{j,i} = \frac{G_i K_{j,i}}{1 + G_i K_{j,i}},$$

we can write

$$X_{j,i} = \frac{\hat{T}_{j,i}}{G_i}.$$

Therefore, $X_{j,i}$ will have large magnitude when we try to push the nominal closed-loop bandwidth beyond the nominal open-loop bandwidth. Since a model usually has its uncertainties become significant for frequencies beyond its bandwidth, from figure 4, we see that if the spectrum of $r_1$ is white, we automatically get the right weighting for the input to $R_{i,j}$ for the system identification scheme.


• It is shown in theorem 4.2 that the effect of $e$ on $\xi$ is given by $\frac{H}{1 + G K_{j,i}}$. Note that this is the effect of $e$ on $y$ attenuated by the sensitivity function of the actual closed-loop system.
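The simplification behind equation 4.10, and the form of the noise term in theorem 4.2, can be checked in a few lines. The following is a sketch that uses only the Bezout identity $NX + DY = 1$ and the parametrizations 4.1, 4.2 and 4.7; the subscripts $i$ and $j$ are dropped for brevity, which is a shortcut of this sketch rather than notation from the text.

```latex
\begin{align*}
\frac{GK}{1+GK}
  &= \frac{(N+RY)X}{(D-RX)Y+(N+RY)X}
   = \frac{(N+RY)X}{NX+DY}
   = (N+RY)X, \\
\frac{G_{i+1}K}{1+G_{i+1}K}
  &= (N+rY)X, \\
\frac{GK}{1+GK}-\frac{G_{i+1}K}{1+G_{i+1}K}
  &= YX(R-r), \\
\frac{1}{1+GK}
  &= (D-RX)Y
  \quad\Longrightarrow\quad
  \frac{H}{1+GK} = \frac{S}{D-RX}\,(D-RX)Y = SY .
\end{align*}
```

With $r_2 = 0$, substituting $\beta = R\alpha + Se$ and $\alpha = X r_1$ into $\xi = Y(\beta - r\alpha)$ gives $\xi = YX(R - r)r_1 + YSe$, which matches the two terms above.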


5  Approximate  Identification  of  the  $R_{i,j}$  Transfer  Function  for  IMC  Controller  Design

In section 4, we have shown that the closed-loop system identification of the plant transfer function $G$ can be reformulated into an open-loop system identification of the stable proper transfer function $R_{i,j}$ that parametrizes the transfer function $G$ via the equation

$$G = \frac{N_i + R_{i,j} Y_{j,i}}{D_i - R_{i,j} X_{j,i}}.$$

In this and the following sections, we shall, for simplicity, study the case where the plant is stable and has no zeros on the imaginary axis of the s-plane, and where the IMC method [15] is used to design the controller $K_{j,i}$. We shall also assume that all estimates $G_i$ of the plant are stable.

If the model

$$G_i = \frac{N_i}{D_i}$$

is also stable, we can let $N_i = G_i$ and $D_i = 1$ so that

$$G = G_i + \frac{R_{i,j}}{1 - R_{i,j} Q_{j,i}} \qquad (5.1)$$

where $Q_{j,i}$ is a stable proper transfer function that parametrizes the controller

$$K_{j,i} = \frac{Q_{j,i}}{1 - Q_{j,i} G_i},$$

and

$$Q_{j,i} = \frac{K_{j,i}}{1 + G_i K_{j,i}}, \qquad (5.2)$$

$$X_{j,i} = Q_{j,i},$$

and

$$Y_{j,i} = 1 - Q_{j,i} G_i.$$

Since the parametrization of $G$ by $R_{i,j}$ depends intimately on $Q_{j,i}$, we shall briefly explain how $Q_{j,i}$ is obtained in the design of the controller $K_{j,i}$. We will use the notations $n_H$ and $d_H$ to denote the numerator polynomial and the denominator polynomial of a rational transfer function $H$.

Given a stable model,

$$G_i = \frac{n_{G_i}}{d_{G_i}},$$

where $d_{G_i}$ has no zeros in the closed right half s-plane, if $n_{G_i}$ has no zeros on the imaginary axis of the s-plane, we can write

$$G_i = \frac{\bar{n}_{G_i} \prod_k (z_k - s)}{d_{G_i}},$$

where all $z_k$ have positive real parts, and $\bar{n}_{G_i}$ has no zeros in the closed right half s-plane. By writing $G_i$ as

$$G_i = [G_i]_m [G_i]_a,$$

where

$$[G_i]_m = \frac{\bar{n}_{G_i} \prod_k (z_k^* + s)}{d_{G_i}},$$

$z_k^*$ is the complex conjugate of $z_k$, and

$$[G_i]_a = \frac{\prod_k (z_k - s)}{\prod_k (z_k^* + s)},$$

we have factored $G_i$ as a product of its minimum-phase factor $[G_i]_m$ and the associated all-pass factor $[G_i]_a$. We can design a controller, using the internal model control (IMC) approach [15], by setting

$$Q_{j,i} = [G_i]_m^{-1} F_{j,i}, \qquad (5.3)$$

where $F_{j,i}$ is a low pass filter of the form

$$F_{j,i} = \left( \frac{\lambda_{j,i}}{s + \lambda_{j,i}} \right)^n,$$

with $n$ chosen large enough so that $Q_{j,i}$ is proper, and $\lambda_{j,i}$ selected (possibly on-line) small enough so that $K_{j,i}$ robustly stabilizes $G_i$.
In the ideal situation where $G_i = G$ is stable and minimum-phase, it follows that the nominal and the actual closed-loop transfer functions of the system are equal and are given by the transfer function $F_{j,i}$. Therefore $\lambda_{j,i}$ is both the nominal and actual closed-loop system bandwidth with a $-3n$ dB attenuation. In general, $G_i \neq G$ and $\lambda_{j,i}$ serves only as an approximate bandwidth of the actual closed-loop system.
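To illustrate why $\lambda_{j,i}$ is only an approximate bandwidth when $G_i \neq G$, the sketch below evaluates the gap between the actual and nominal closed-loop transfer functions on a frequency grid for an IMC design. The plant, the model and the filter form $(\lambda/(s+\lambda))^n$ are assumptions chosen for illustration (the filter form is the one consistent with the $-3n$ dB remark above); the plant differs from the model by an unmodelled fast pole.

```python
import numpy as np

w = np.logspace(-2, 3, 4000)
s = 1j * w

# Hypothetical "true" plant and a minimum-phase model missing the fast pole.
G = 1.0 / ((s + 1.0) * (0.1 * s + 1.0))
Gi = 1.0 / (s + 1.0)

def mismatch(lam, n=1):
    """Peak of |T_actual - T_nominal| over the grid for bandwidth parameter lam."""
    F = (lam / (s + lam)) ** n                  # assumed IMC filter form
    Q = F / Gi                                  # Q = [Gi]_m^{-1} F (Gi is minimum phase)
    T_nom = Gi * Q                              # nominal closed loop, equals F
    T_act = G * Q / (1.0 + Q * (G - Gi))        # actual closed loop with K = Q/(1 - Q*Gi)
    return np.max(np.abs(T_act - T_nom))

peaks = [mismatch(lam) for lam in (0.5, 2.0, 8.0)]
print(peaks)
```

For these choices the peak gap grows as $\lambda$ is pushed toward the unmodelled dynamics, which is the situation that triggers a model update in the windsurfer procedure.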

With the controller designed using the above procedure, we shall now show that the transfer function to be identified, $R_{i,j}$, is the product of a known stable proper transfer function and an unknown stable strictly-proper transfer function. An analysis of the form of the unknown factor in $R_{i,j}$ indicates how it can be sensibly approximated by a low-order transfer function. We shall first rewrite equation 5.1 as

$$R_{i,j} = \frac{G - G_i}{1 + Q_{j,i}(G - G_i)}. \qquad (5.4)$$

Then we can obtain, after substituting equations 5.2 and 5.3 into equation 5.4, and performing some algebraic manipulations,

$$R_{i,j} = [G_i]_m\, d_{F_{j,i}}\, \frac{n_G d_{G_i} - d_G n_{G_i}}{d_{F_{j,i}}\, n_{[G_i]_m}\, d_G + n_{F_{j,i}} (n_G d_{G_i} - d_G n_{G_i})}, \qquad (5.5)$$

where $n_{[G_i]_m}$ denotes the numerator polynomial of $[G_i]_m$. Note that equation 5.5 can also be written as

$$R_{i,j} = \bar{R}_{i,j} \tilde{R}_{i,j}, \qquad (5.6)$$

where

$$\bar{R}_{i,j} = [G_i]_m\, d_{F_{j,i}} \qquad (5.7)$$

is a known stable proper transfer function, and

$$\tilde{R}_{i,j} = \frac{n_G d_{G_i} - d_G n_{G_i}}{d_{F_{j,i}}\, n_{[G_i]_m}\, d_G + n_{F_{j,i}} (n_G d_{G_i} - d_G n_{G_i})} \qquad (5.8)$$

is an unknown stable strictly proper transfer function that depends on the unknown transfer function $G$. Therefore the problem of identifying $R_{i,j}$ has become one of identifying its unknown factor $\tilde{R}_{i,j}$. We shall summarize this important result in the following theorem.

Theorem 5.1 Consider a plant which has an unknown stable proper transfer function $G$, and a model with a known stable proper transfer function $G_i$. If $G$ and $G_i$ have no zeros along the imaginary axis of the s-plane, and

$$G_i = [G_i]_m [G_i]_a,$$

where $[G_i]_m$ is the minimum-phase factor of $G_i$, and $[G_i]_a$ is the all-pass factor of $G_i$, then with

$$Q_{j,i} = [G_i]_m^{-1} F_{j,i}$$

and

$$F_{j,i} = \left( \frac{\lambda_{j,i}}{s + \lambda_{j,i}} \right)^n,$$

where $n$ is chosen such that $Q_{j,i}$ is a stable proper transfer function, the controller

$$K_{j,i} = \frac{Q_{j,i}}{1 - Q_{j,i} G_i}$$

will robustly stabilize $G_i$ for all sufficiently small values of $\lambda_{j,i} > 0$. Furthermore, the unknown stable strictly proper transfer function to be identified,

$$R_{i,j} = \frac{G - G_i}{1 + Q_{j,i}(G - G_i)},$$

can be factorized as

$$R_{i,j} = \bar{R}_{i,j} \tilde{R}_{i,j},$$

where $\tilde{R}_{i,j}$ is an unknown stable proper transfer function to be identified, and $\bar{R}_{i,j}$ is a known stable proper transfer function given by

$$\bar{R}_{i,j} = [G_i]_m\, d_{F_{j,i}},$$

where $d_{F_{j,i}}$ is the denominator polynomial of the filter $F_{j,i}$.
Remarks

•  Note that the factorization of $R_{i,j}$ given in theorem 5.1 is
naturally induced by the IMC [15] controller design procedure
that we have adopted.

•  The poles of $\tilde R_{i,j}$ are the poles of $T_{j,i}$, the actual
closed-loop transfer function of the system.

•  It is important to note that $R_{i,j} = 0$ if and only if $G = G_i$.

•  The order of $\tilde R_{i,j}$ is constrained by the degree of the
polynomial $d_{F_{j,i}} d_G$, which is unknown.
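
The factorization can be checked with a short polynomial calculation. The following is only a sketch: it assumes the filter $F_{j,i} = \lambda_{j,i}^n / d_{F_{j,i}}$ with $d_{F_{j,i}} = (s + \lambda_{j,i})^n$, writes $\bar R_{i,j}$ for the known factor, and introduces the polynomial names $n_G$, $d_G$, $n_{G_i}$, $d_{G_i}$ and $n_m$ (numerators and denominators of $G$, $G_i$ and $[G_i]_m$), none of which are taken from the text.

```latex
\begin{aligned}
G &= \frac{n_G}{d_G}, \qquad G_i = \frac{n_{G_i}}{d_{G_i}}, \qquad
[G_i]_m = \frac{n_m}{d_{G_i}}, \qquad
Q_{j,i} = [G_i]_m^{-1} F_{j,i} = \frac{d_{G_i}\,\lambda_{j,i}^n}{n_m\, d_{F_{j,i}}}, \\[4pt]
G - G_i &= \frac{n_G d_{G_i} - d_G n_{G_i}}{d_G\, d_{G_i}}, \\[4pt]
1 + Q_{j,i}(G - G_i) &=
  \frac{n_m d_{F_{j,i}} d_G + \lambda_{j,i}^n \,(n_G d_{G_i} - d_G n_{G_i})}
       {n_m\, d_{F_{j,i}}\, d_G}, \\[4pt]
R_{i,j} &= \underbrace{\frac{n_m}{d_{G_i}}\, d_{F_{j,i}}}_{[G_i]_m\, d_{F_{j,i}}}
  \cdot
  \frac{n_G d_{G_i} - d_G n_{G_i}}
       {n_m d_{F_{j,i}} d_G + \lambda_{j,i}^n \,(n_G d_{G_i} - d_G n_{G_i})}.
\end{aligned}
```

Under these assumptions the second factor vanishes exactly when $G = G_i$, and its denominator is also the denominator of $GK_{j,i}/(1 + GK_{j,i})$, the actual closed-loop transfer function, in agreement with the remarks above.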

As we do not know the order of $\tilde R_{i,j}$ a priori, and since
only step response information is available, it is reasonable to
employ a low-order transfer function for the approximate
identification of $\tilde R_{i,j}$. Since we are going to identify
$R_{i,j}$ (actually $\tilde R_{i,j}$) and update $G_i$ to $G_{i+1}$ when
the step response of the actual closed-loop system exhibits
unacceptable oscillations and/or overshoots, we expect
$\tilde R_{i,j}$ to have complex-conjugate poles. Therefore, the
lowest possible order that we can assume for the transfer function
which serves as an approximation of $\tilde R_{i,j}$ is two.

It  was  shown  in  equation  4.10  that  the  system  identification 
problem  is  to  find 


If we define

    $$r_{i,j} = \bar R_{i,j} \tilde r_{i,j} \qquad (5.10)$$

where $\tilde r_{i,j}$ is an unknown second-order stable strictly proper
transfer function, then by substituting equations 5.3, 5.6, and 5.10
into equation 5.9, we can show that the system identification problem
becomes one of finding

    $$\tilde r_{i,j} = \arg\min_{\theta} \left\| \lambda_{j,i}^n Y_{j,i} (\tilde R_{i,j} - \theta) \right\|_\infty \qquad (5.11)$$


Since $Y_{j,i}$ is the nominal sensitivity function of the closed-
loop system, we immediately see that the frequency shaping
in the identification criterion given by equation 5.11 will
force the updated model to have small modelling error in the
range of frequencies where the nominal sensitivity function
cannot be made small by the controller $K_{j,i}$.
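
The step from equation 5.9 to equation 5.11 can be sketched as follows. This is a reconstruction under two assumptions that are not stated explicitly in this passage: that the criterion in equation 5.9 has the weighted form $\| Q_{j,i} Y_{j,i} (R_{i,j} - \theta) \|_\infty$, and that the filter is $F_{j,i} = \lambda_{j,i}^n / d_{F_{j,i}}$. Here $\bar R_{i,j} = [G_i]_m d_{F_{j,i}}$ denotes the known factor, and $\tilde R_{i,j}$, $\tilde r_{i,j}$ the unknown factors of $R_{i,j}$ and $r_{i,j}$.

```latex
\begin{aligned}
Q_{j,i}\,\bar R_{i,j}
  &= [G_i]_m^{-1}\,\frac{\lambda_{j,i}^n}{d_{F_{j,i}}}\;[G_i]_m\, d_{F_{j,i}}
   = \lambda_{j,i}^n, \\[4pt]
Q_{j,i}\, Y_{j,i}\,(R_{i,j} - r_{i,j})
  &= Q_{j,i}\,\bar R_{i,j}\; Y_{j,i}\,(\tilde R_{i,j} - \tilde r_{i,j})
   = \lambda_{j,i}^n\, Y_{j,i}\,(\tilde R_{i,j} - \tilde r_{i,j}).
\end{aligned}
```

The known factor cancels against $Q_{j,i}$, so only the constant $\lambda_{j,i}^n$ and the nominal sensitivity function remain as frequency weights.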

When updating the model using the equation

    $$G_{i+1} = G_i + \frac{r_{i,j}}{1 - r_{i,j} Q_{j,i}},$$

the order of the model may increase. To prevent the model
order from increasing indefinitely, we use a frequency weighted
balanced truncation scheme to reduce the order of $G_{i+1}$.
Specifically, we find

where $\hat G_{i+1}$ is the reduced order model. If the model order
is restricted to $m$, the controller will be at most of order $2m$
(see controller design equations given in theorem 5.1). In
this way the controller complexity will be limited.
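
The update equation can be sanity-checked pointwise in frequency: if the identified $r_{i,j}$ were exactly equal to $R_{i,j} = (G - G_i)/(1 + Q_{j,i}(G - G_i))$, the updated model would coincide with the plant. The sketch below assumes a plant with numerator 9 over the denominator used in the simulations, a hypothetical first-order model `G_i`, and a hypothetical `lam = 0.05` with $n = 1$; none of these choices are claimed to be the ones used by the authors.

```python
def G(s):
    """Assumed plant: 9 / ((s + 1)(s^2 + 0.06 s + 9))."""
    return 9.0 / ((s + 1.0) * (s * s + 0.06 * s + 9.0))

def G_i(s):
    """Hypothetical crude model (minimum phase, so [G_i]_m = G_i)."""
    return 1.0 / (s + 1.0)

def Q(s, lam=0.05):
    """IMC parameter [G_i]_m^{-1} F with the assumed filter F = lam / (s + lam)."""
    return (s + 1.0) * lam / (s + lam)

def R(s):
    """Exact value of the transfer function to be identified."""
    d = G(s) - G_i(s)
    return d / (1.0 + Q(s) * d)

def updated_model(s, r):
    """Model update G_{i+1} = G_i + r / (1 - r Q), evaluated at the point s."""
    return G_i(s) + r(s) / (1.0 - r(s) * Q(s))

for w in (0.1, 1.0, 3.0, 10.0):
    s = complex(0.0, w)
    print(w, abs(updated_model(s, R) - G(s)))  # differences at rounding level
```

With a low-order approximation in place of the exact `R`, the same function gives the frequency response of the updated (and possibly higher-order) model.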


6  Simulation  Results 

We shall present some simulation results of applying the windsurfer
approach to the control of a plant with the transfer function

    $$G(s) = \frac{9}{(s + 1)(s^2 + 0.06s + 9)}$$

We first summarize the procedure in the following algorithm:
Step 1:

Set $G_i = G_0$, where $G_0$ is the transfer function of an
initial model of the plant.

Step 2:

Factorize $G_i$ as

    $$G_i = [G_i]_m [G_i]_a,$$

where $[G_i]_m$ is the minimum-phase factor of $G_i$, and
$[G_i]_a$ is the associated all-pass factor of $G_i$.


Step 3:

For $j = 1$, find

    $$K_{j,i} = \frac{Q_{j,i}}{1 - Q_{j,i} G_i},$$

    $$Q_{j,i} = [G_i]_m^{-1} F_{j,i},$$

where the positive integer $n$ and the parameter $\lambda_{j,i}$ in
the transfer function

    $$F_{j,i} = \left( \frac{\lambda_{j,i}}{s + \lambda_{j,i}} \right)^n$$

are chosen such that $Q_{j,i}$ is a stable proper transfer
function, and $K_{j,i}$ robustly stabilizes $G_i$ in the sense
that the step response of the actual closed-loop system
has, at most, little oscillation and/or overshoot.
Stop here if such a robust stabilizing controller cannot
be found. Also stop here if the robust stabilizing controller
results in a closed-loop system which meets the
specified bandwidth. Otherwise, proceed to the next
step.

Step 4:

Let $j = j + 1$ and set $\lambda_{j,i} = \lambda_{j-1,i} + \epsilon$ for small $\epsilon > 0$,
and redesign the controller $K_{j,i}$ using the equations
given in Step 3. Stop here if the design produces a robust
stabilizing controller with the closed-loop system
satisfying the specified bandwidth. Otherwise, repeat
this step if $K_{j,i}$ robustly stabilizes $G_i$; else proceed to
the next step.

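Step 5:

Perform rational function approximation to obtain

    $$\tilde r_{i,j} = \arg\min_{\theta} \left\| \lambda_{j,i}^n Y_{j,i} (\tilde R_{i,j} - \theta) \right\|_\infty.$$

Then update the model using the following set of equations:

    $$\bar R_{i,j} = [G_i]_m \, d_{F_{j,i}},$$

    $$r_{i,j} = \bar R_{i,j} \tilde r_{i,j},$$

and

    $$G_{i+1} = G_i + \frac{r_{i,j}}{1 - r_{i,j} Q_{j,i}}.$$
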

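Step 6:

If $G_{i+1}$ is stable, find the reduced order model

    $$\hat G_{i+1} = \arg\min_{\eta} \left\| \frac{G_{i+1} K_{j,i}}{1 + G_{i+1} K_{j,i}} - \frac{\eta K_{j,i}}{1 + \eta K_{j,i}} \right\|_\infty.$$

Otherwise, stop here.

Step 7:

Set $G_i = \hat G_{i+1}$ and return to Step 2.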
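
The bandwidth-raising loop of Steps 3 and 4 can be illustrated in miniature. The sketch below is not the authors' procedure: it replaces the "little oscillation and/or overshoot" test by a bare stability test, assumes the plant $G = 9/((s+1)(s^2+0.06s+9))$, and uses a hypothetical initial model $G_i = 1/(s+1)$ with $n = 1$. Under those assumptions the IMC controller is $K = \lambda (s+1)/s$ and, after the stable $(s+1)$ cancellation, the closed loop has characteristic polynomial $s^3 + 0.06 s^2 + 9 s + 9\lambda$, so a Routh-Hurwitz test suffices.

```python
def hurwitz_cubic(a2, a1, a0):
    """Routh-Hurwitz stability test for s^3 + a2 s^2 + a1 s + a0."""
    return a2 > 0 and a1 > 0 and a0 > 0 and a2 * a1 > a0

def closed_loop_stable(lam):
    """Stability of the actual loop for the assumed plant/model pair.

    Characteristic polynomial: s^3 + 0.06 s^2 + 9 s + 9 lam.
    """
    return hurwitz_cubic(0.06, 9.0, 9.0 * lam)

def largest_stable_bandwidth(eps=0.008, max_steps=1000):
    """Raise lam in steps of eps (hypothetical step size) until stability is lost."""
    k = 0
    while k < max_steps and closed_loop_stable((k + 1) * eps):
        k += 1
    return k * eps  # last value of lam that kept the loop stable

print(largest_stable_bandwidth())
```

For this crude model the loop loses stability once $\lambda$ reaches 0.06 rad/sec, which is the point at which the algorithm would move on to Step 5 and update the model before the bandwidth can be raised further.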


Remarks

•  In the algorithm, rational function approximation has to be
carried out when $\| T_{j,i} - \hat T_{j,i} \|_\infty$ is no longer small, where
$T_{j,i}$ and $\hat T_{j,i}$ denote the actual and the nominal closed-loop
transfer functions. Broadly speaking, this will correspond to a
significant difference between the designed nominal performance
(depending on $G_i$ and $K_{j,i}$) and the actual performance
(depending on $G$ and $K_{j,i}$). In particular, the observed step
response may exhibit much more oscillations and/or overshoots than
the designed values. This is not of course the same thing as
guaranteeing that the $H_\infty$ error above has become large, but
neither is it unrelated.

•  To be more precise, we define the peak gain of a system,
whose transfer function is $T$, by

    $$\| T \|_{pk} = \sup_{\| u \|_\infty \neq 0} \frac{\| y \|_\infty}{\| u \|_\infty}.$$

This is also equal to the total variation of the system's unit
step response [4], defined as the sum of all consecutive peak-
to-valley differences in the unit step response. It can be
shown [5] that, if $T$ is a stable strictly proper transfer
function,

    $$\| T \|_\infty \le \| T \|_{pk} \le 2n \| T \|_\infty,$$

where $n$ is the order of the transfer function $T$. Now we
consider the peak error

    $$\| T_{j,i} - \hat T_{j,i} \|_{pk},$$

where $T_{j,i}$ and $\hat T_{j,i}$ denote the actual and the nominal
closed-loop transfer functions. Since

    $$\| T_{j,i} - \hat T_{j,i} \|_{pk} \ge \| T_{j,i} \|_{pk} - \| \hat T_{j,i} \|_{pk},$$

therefore, if the observed step response of $T_{j,i}$ exhibits much
more oscillations and/or overshoots than the designed step
response of $\hat T_{j,i}$, we would expect

    $$\| T_{j,i} \|_{pk} \gg \| \hat T_{j,i} \|_{pk},$$

and hence,

    $$\| T_{j,i} - \hat T_{j,i} \|_{pk} \ge \epsilon, \quad \epsilon > 0.$$

Since the peak gain also provides a loose lower bound for
the $H_\infty$ gain, it is likely that

    $$\| T_{j,i} - \hat T_{j,i} \|_\infty$$

becomes large when the observed actual step response exhibits
much more oscillations and/or overshoots than the
desired one.

•  This explains why, in the simulation, the models are updated
whenever the actual step response exhibits unacceptable
oscillations and/or overshoots.
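
The relation between peak gain and $H_\infty$ gain can be illustrated on a standard underdamped second-order system, $T(s) = \omega_n^2/(s^2 + 2\zeta\omega_n s + \omega_n^2)$. This example system is not taken from the text. For it, the total variation of the unit step response and the resonant peak both have closed forms, and a Riemann sum of the impulse response magnitude serves as a cross-check of the peak gain.

```python
import math

def second_order_norms(zeta, wn=1.0):
    """Peak gain and H-infinity norm for 0 < zeta < 1/sqrt(2)."""
    # Ratio between successive half-cycle excursions of the step response.
    q = math.exp(-math.pi * zeta / math.sqrt(1.0 - zeta ** 2))
    peak_gain = (1.0 + q) / (1.0 - q)  # sum of all peak-to-valley differences
    h_inf = 1.0 / (2.0 * zeta * math.sqrt(1.0 - zeta ** 2))  # resonant peak
    return peak_gain, h_inf

def l1_norm_numeric(zeta, wn=1.0, dt=2e-3, t_end=150.0):
    """Riemann sum of |h(t)|, where h is the impulse response of T."""
    wd = wn * math.sqrt(1.0 - zeta ** 2)
    total = 0.0
    for k in range(int(t_end / dt)):
        t = k * dt
        h = (wn ** 2 / wd) * math.exp(-zeta * wn * t) * math.sin(wd * t)
        total += abs(h) * dt
    return total

pk, hinf = second_order_norms(0.1)
n = 2  # order of T
print(hinf <= pk <= 2 * n * hinf)
```

A lightly damped response therefore has a peak gain that lies above its $H_\infty$ gain but within the factor $2n$, which is why large observed oscillations are taken as an indication of a large $H_\infty$ error.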

The simulation results are presented in figure 5 and figure 8.
These figures correspond respectively to the following case studies:

•  Case 1: the initial model is $G_0(s) =$

•  Case 2: the initial model is $G_0(s) =$

We present unit step responses at various steps in the system
identification/control design iteration, and frequency responses
achieved just before the iteration process is stopped.

In the first case study, see figure 5, the bandwidth of the closed-
loop system cannot be increased beyond 10 rad/sec because we
have stopped the iterative system identification and control design
process when an unstable model is obtained. Note that only two
model updates, $G_1$ and $G_2$, are required in the process, and the
results are sufficiently good for most practical purposes.

The results for the second case study are given in figure 8.
These results show that the closed-loop bandwidth can easily be
pushed to 10 rad/sec with very good step responses. Note that in
this case, the model has to be updated only once.

Remark

•  We must emphasize that in these simulations, instead of
performing a system identification using input-output
measurements, we actually perform the model approximation

    $$\tilde r_{i,j} = \arg\min_{\theta} \left\| \lambda_{j,i}^n Y_{j,i} (\tilde R_{i,j} - \theta) \right\|_\infty,$$

where $\tilde R_{i,j}$ is obtained from the known $G$. The reasons for
doing this are:

1.  Our results, although preliminary, serve as a benchmark
in the sense that using the transfer function $G$
corresponds to performing system identification with
an infinite number of noiseless measurements.

2.  We would like to know how serious the problems may be due
to employing a low-order approximation for $\tilde r_{i,j}$. This
is important for later system identification studies.

3.  We are, at this stage, more concerned with the concept
of iterative system identification and control design as
applied to adaptive robust control, rather than the
details.

4.  Efficient algorithms for performing $H_\infty$ system
identification are still lacking, and the corresponding theory
is still not well understood [11, 16, 17].
7  Discussions and Conclusions

We have reviewed in section 1 the strengths and weaknesses of both
the traditional adaptive control and the robust control design
methods. These methods should be able to complement each other
and there should be natural ways in which they could be blended
harmoniously. We proposed that one of the possible ways is by
the windsurfer approach, which was first mentioned in [2]. We
have shown, by simulation, that by starting with a (crude) initial
model of the plant and a (small bandwidth) robustly stabilising
controller, the bandwidth of the closed-loop system can be
increased progressively through an iterative control-relevant system
identification and control design procedure. We shall highlight
the following points which we believe are reasons for the success
of the approach:

•  The use of control-relevant frequency weighting in the system
identification criterion.

•  Updating of the model when its effect is no longer small in
the closed-loop response. This will ensure that model
uncertainties are emphasised in the correct range of frequencies.

•  The controller designed by using the IMC method always
has integral action. Therefore it is insensitive to model
uncertainties at low frequencies, provided the gain of the model
at low frequencies is of the right sign.

•  The controller designed by using the IMC method induces a
natural factorization in the parametrization of the unknown
transfer function of the plant. This enables the system
identification problem to be solved effectively.

In conclusion, we would like to emphasize that only the case
of stable plant and model is considered in this preliminary study.
We would like to address the following problems in the near future:

•  The extension of the method to deal with unstable plant
and model.

•  Use of orthogonalized exponentials in the system identification
procedure such that it becomes a convex optimization
problem.

•  To prove that the algorithm actually converges in some sense.

•  To study other control design methods in the context of the
windsurfer approach.
Figure 2: Closed-loop system


Figure 3: Closed-loop system identification


Figure 4: Excitation of $R_{i,j}$
Adaptive Robust Control: On-Line Learning

Brian  D.  O.  Anderson  *  Robert  L.  Kosut  * 


Abstract  A method of on-line adaptation and learning is proposed
which makes use of a probing signal whose frequency content
is concentrated at the bandwidth of the current controller. As the
plant is learned the procedure naturally increases the learning
bandwidth.


1  Introduction 

It is very easy to construct an adaptive system: just connect a
controller design rule and a model parameter estimator together.
This kind of adaptive control system operates along roughly the
following lines. A model for the unknown plant is assumed in which
everything is known but the values of a finite number of parameters.
These parameters have the property that when they are known, the
controller can be defined. It too has a finite number of adjustable
parameters, the values of which depend on the plant parameters.
By observing the plant input and output, the plant parameters
are learned and/or tracked, and the controller parameters are then
set according to some design rule. Sometimes it is the controller
parameters which are learned directly. Certain choices of controller
parametrization lend themselves to this approach, others do not.

What is absent in this approach is the recognition that the
estimated plant parametric model during the learning phase can be
a poor representation of the true plant. This mismatch between
the plant and the estimated model can cause poor performance via
such phenomena as parameter drifting and bursting. All of this has
been reported in the literature and under certain conditions has
been analyzed and explained, [1], [2].

In  this  paper  we  invoke  a  different  design  philosophy  than  that 
expressed  by  the  previous  reasoning.  The  new  reasoning  would  have 
to  recognize  at  the  outset  that  the  true  plant  can  differ  greatly  from 
the  estimated  model  at  any  one  time,  particularly  during  the  initial 
learning  stage. 

Nature provides examples of this kind of adaptive control, and
it seems that many such examples do not exhibit the traditional
operating strategy. In particular, consider how humans learn
windsurfing, where the human is the adaptive controller. Several
observations can be made: (1) The problem has multiple inputs. (2) The
human first learns to control over a limited bandwidth, and learning
pushes out the bandwidth. (3) The human first implements a low
gain controller; and learning causes the loops to be tightened (this is
linked with 2). These observations suggest that one could
contemplate an adaptive controller based on learning a frequency domain
description of the plant, with the learning process pushing out the
bandwidth over which the plant was accurately known. For such a
concept to be valid and consistent with point 3 above, it would be
necessary to demonstrate, at least for a broad class of plants, that a
low gain controller can be contemplated for plants with significant
uncertainty at high frequencies, and that reduction in the
structured uncertainty progressively allows an increase of the controller
gain and control over an increasing frequency band; this is essentially a
linear systems, as opposed to adaptive systems, exercise.

It  would  also  be  desirable  to  show  that  when  the  behaviour  of 
the  plant  over  a  certain  bandwidth  had  been  learned  and  certain 
controller  gains  implemented,  it  would  be  natural  to  apply  a  probing 
signal  at  the  upper  limit  of  this  bandwidth  (perhaps  in  handling 
transients)  so  that  the  bandwidth  of  knowledge  of  the  plant  was 
expanded. 


2  Closed-Loop  Identification 

For the sake of expository simplicity, we shall restrict attention to
scalar plants. The following result can be found in one form or
another in [9] and the references therein.

Theorem 1  Suppose that $X, Y, N, D$ are stable transfer functions
satisfying

    $$XN + YD = 1 \qquad (1)$$

Then:

(i) All controllers $C$ which stabilize the plant $P = N/D$ are in
the set of transfer functions

    $$\left\{ \frac{X + QD}{Y - QN} : Q \text{ stable} \right\} \qquad (2)$$

(ii) All plants $P$ stabilized by the controller $C = X/Y$ are in the
set of transfer functions

    $$\left\{ \frac{N + RY}{D - RX} : R \text{ stable} \right\} \qquad (3)$$

Since all rational transfer functions can be expressed as a ratio of
stable transfer functions, it follows that part (i) gives a
parametrization of all stabilizing rational controllers of rational plants.

Statement (ii), which follows directly from (i) by interchanging
the plant and controller, was developed in [3, 4] for use in closed-
loop identification for the problem of experiment design. Similar
results are also in [8]. In this paper we also utilize this result, but
for a slightly different purpose.

Consider the feedback system,

    $$y = Gu + He \qquad (4)$$

    $$u = K_0(r_1 - y) + r_2 \qquad (5)$$

where $(y, u)$ are the measured output and control input,
respectively, $e$ is an unpredictable disturbance, and $(r_1, r_2)$ are user
applied inputs. It is assumed that $K_0$ is a stabilizing feedback
compensator. This implies some knowledge of $G$, but otherwise $G$ and
$H$ are assumed unknown. The plant is the pair $(G, H)$ where $G$ is
possibly unstable and, as is standard, $H$ and $H^{-1}$ are stable [6].
The identification problem is to obtain estimates of $(G, H)$ from a
finite set of measured and known data $\{y, u, r_1, r_2 : 0 \le t \le T\}$.
Following identification, the controller is to be re-designed to
improve performance of the closed-loop system.
Stable Plant  Let us consider the special case when the plant
$G$ is stable. Suppose also that $G_0$ is stable and that $K_0$ stabilizes
$G_0$. Then, by Theorem 1, it can be shown that $K_0$ stabilizes $G$ iff
there exists a stable $R$ and stable minimum-phase $S$, such that

    $$G = G_0 + \frac{R}{1 - RQ_0}, \qquad H = \frac{S}{1 - RQ_0} \qquad (6)$$

where

    $$Q_0 = \frac{K_0}{1 + G_0 K_0} \qquad (7)$$


Again, an interpretation is that $K_0$ stabilizes all plants in the set

    $$\left\{ G_0 + \frac{R}{1 - RQ_0} : R \text{ stable} \right\} \qquad (8)$$



As a result, identification of $(G, H)$ in closed-loop is equivalent to
identification of the stable open-loop $(R, S)$-system,

    $$\beta = R\alpha + Se \qquad (9)$$

where $\beta$, $\alpha$ are given by

    $$\beta = y - G_0 u \qquad (10)$$

    $$\alpha = Q_0 r_1 + (1 - Q_0 G_0) r_2 \qquad (11)$$

Observe that $(\alpha, \beta)$ depend on measured and applied signals
$(y, u, r_1, r_2)$ operated on by known stable systems $(G_0, Q_0)$.
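
The equivalence between the closed-loop $(G, H)$ description and the open-loop $(R, S)$ description can be verified at a single frequency by treating all signals as complex phasors. In the sketch below the nominal model `G0`, the controller `K0`, and the noise model `H = 1` are hypothetical choices made only for illustration; the plant is taken with numerator 9 over the denominator of the Example.

```python
def check_rs_identity(s, r1, r2, e):
    """Return (beta, R*alpha + S*e) at the complex frequency point s."""
    G = 9.0 / ((s + 1.0) * (s * s + 0.06 * s + 9.0))  # plant
    H = 1.0                                           # hypothetical noise model
    G0 = 1.0 / (s + 1.0)                              # hypothetical nominal model
    K0 = 0.05 * (s + 1.0) / s                         # hypothetical controller
    Q0 = K0 / (1.0 + G0 * K0)                         # equation (7)

    # Closed-loop signals from y = G u + H e and u = K0 (r1 - y) + r2.
    y = (G * (K0 * r1 + r2) + H * e) / (1.0 + G * K0)
    u = K0 * (r1 - y) + r2

    # (R, S) obtained by inverting equation (6).
    d = G - G0
    R = d / (1.0 + Q0 * d)
    S = H * (1.0 - R * Q0)

    beta = y - G0 * u                          # equation (10)
    alpha = Q0 * r1 + (1.0 - Q0 * G0) * r2     # equation (11)
    return beta, R * alpha + S * e

lhs, rhs = check_rs_identity(2j, 1.0, 0.5 - 0.2j, 0.3j)
print(abs(lhs - rhs))  # rounding-level difference
```

Because $\alpha$ is built only from the applied inputs $r_1$ and $r_2$, it does not depend on the disturbance $e$, which is what makes the $(R, S)$ problem an open-loop one.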


Example  To further motivate identifying the $(R, S)$-system,
consider the following example:

    $$G = \frac{9}{(s + 1)(s^2 + 0.06s + 9)},$$

    $G_0$ = [expression not recoverable],  $Q_0$ = [expression not recoverable]

Figure 1 shows the magnitude of $R$ and $G - G_0$ vs. frequency.
These are very close, showing that identification of $R$ is close to
identification of the model error $G - G_0$.



Thus, we are led to the following iterative identification algorithm
for stable plants in closed-loop. A similar formulation is available
for the general case where the plant is possibly unstable.

    Initialize:            $G = G_{00}$, $Q = Q_{00} = \dfrac{K_{00}}{1 + G_{00} K_{00}}$

    Update:                $G_0 = G$, $Q_0 = Q$, $K_0 = \dfrac{Q_0}{1 - Q_0 G_0}$

    Identification input:  $u = K_0(r_1 - y) + r_2$

    R-Update:              $\hat R = \arg\min_R \| y - G_0 u - R(Q_0 r_1 + (1 - Q_0 G_0) r_2) \|$

    G-Update:              $G = G_0 + \dfrac{\hat R}{1 - \hat R Q_0}$

    Controller Design:     $Q = \arg\min_Q \| H_{\text{desired}} - GQ \|$

    Repeat

Although we cannot offer any proof at this time, we believe that
this iterative procedure provides a natural approach to learning by
gradually increasing the bandwidth of the controller. The essential
features fall out of the fractional representation theory, in particular
via the transformation from the $(G, H)$ system in closed-loop to the
$(R, S)$-system in open-loop, and subsequent identification of the
$(R, S)$ system to obtain estimates of $(G, H)$.

Figure 1: Magnitude plots of $R$ and $G - G_0$ vs. frequency.